Analyzed cross-lingual knowledge representation in BLOOM-1.7B by computing cosine similarity across English, French, and Portuguese; average similarity was high (0.91-0.94) even though Portuguese makes up only 5.2% of the training data.
• Created a trilingual lexicon dataset with the Gemini API, available on Hugging Face.
https://huggingface.co/datasets/MLap/English-French-Portuguese-Lexicon
• Conducted zero-shot cross-lingual transfer experiments by fine-tuning BLOOM on an English (high-resource language) sentiment classification task and testing on Hindi (low-resource language) to evaluate multilingual generalization.
• Created SentiHin-2500, a Hindi sentiment analysis dataset, available on Hugging Face.
https://huggingface.co/datasets/MLap/SentiHin-2500
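The cosine-similarity comparison described above can be sketched roughly as follows. This is a minimal illustration, assuming each lexicon entry has already been mapped to one embedding vector per language; the toy vectors and the function names `cosine_similarity` and `average_pairwise_similarity` are hypothetical and not taken from the project.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def average_pairwise_similarity(embeddings):
    """Mean cosine similarity per language pair.

    `embeddings` maps a language code to a list of vectors, where index i
    in every list refers to the same lexicon entry (an assumption here).
    """
    result = {}
    for lang_a, lang_b in combinations(sorted(embeddings), 2):
        sims = [
            cosine_similarity(u, v)
            for u, v in zip(embeddings[lang_a], embeddings[lang_b])
        ]
        result[(lang_a, lang_b)] = sum(sims) / len(sims)
    return result


# Toy 2-dimensional vectors (hypothetical, not real BLOOM activations).
toy = {
    "en": [[1.0, 0.0], [0.0, 2.0]],
    "fr": [[2.0, 0.0], [0.0, 1.0]],
    "pt": [[0.0, 1.0], [0.0, 3.0]],
}
scores = average_pairwise_similarity(toy)
```

In practice the vectors would come from the model's hidden states for each translated word; how those are pooled is a design choice this sketch leaves open.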
Visit GitHub repo
• Main feature: haze removal from images and video.
• Demo at 🤗: huggingface.co/spaces/MLap/deFogify
• This project implements a single-image haze removal technique using the Dark Channel Prior, as described in the research paper "Single Image Haze Removal Using Dark Channel Prior".
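For readers unfamiliar with the method, the core steps of the Dark Channel Prior can be sketched as below. This is a simplified pure-Python illustration, not the project's code: it assumes an image given as nested lists of RGB values in [0, 1], uses a tiny patch radius, and omits the transmission refinement step from the paper. The function names are hypothetical; `omega=0.95` and `t0=0.1` follow the values suggested in the paper.

```python
def dark_channel(img, radius=1):
    """Per-pixel minimum over RGB, then a minimum filter over a square patch."""
    h, w = len(img), len(img[0])
    min_rgb = [[min(px) for px in row] for row in img]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dark[y][x] = min(
                min_rgb[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return dark


def atmospheric_light(img, dark, top_fraction=0.001):
    """Among the brightest dark-channel pixels, take the most intense input pixel."""
    h, w = len(img), len(img[0])
    coords = sorted(
        ((y, x) for y in range(h) for x in range(w)),
        key=lambda p: dark[p[0]][p[1]],
        reverse=True,
    )
    keep = max(1, int(len(coords) * top_fraction))
    y, x = max(coords[:keep], key=lambda p: sum(img[p[0]][p[1]]))
    return list(img[y][x])


def dehaze(img, radius=1, omega=0.95, t0=0.1):
    """Recover scene radiance J = (I - A) / max(t, t0) + A, clipped to [0, 1]."""
    airlight = [max(a, 1e-6) for a in atmospheric_light(img, dark_channel(img, radius))]
    # Transmission estimate: t = 1 - omega * dark_channel(I / A).
    normalized = [[[c / a for c, a in zip(px, airlight)] for px in row] for row in img]
    t = [[1.0 - omega * d for d in row] for row in dark_channel(normalized, radius)]
    return [
        [
            [
                min(1.0, max(0.0, (c - a) / max(t[y][x], t0) + a))
                for c, a in zip(px, airlight)
            ]
            for x, px in enumerate(row)
        ]
        for y, row in enumerate(img)
    ]


# Toy 1x2 image: one bright, haze-colored pixel and one hazy scene pixel.
hazy = [[[0.9, 0.9, 0.9], [0.6, 0.5, 0.7]]]
restored = dehaze(hazy, radius=0)
```

A practical implementation would typically use NumPy or OpenCV for the minimum filter and a larger patch (the paper uses 15x15); the nested loops here only keep the sketch dependency-free.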
Visit GitHub repo
• 30+ ⭐ on GitHub
• Extracts links, text, and more from custom screenshots.
• Advantages: works offline and is free to use.
• Integrated with Google Gemini through its API for querying the extracted text (this feature requires internet access).
• Some features were developed in 36 hours at Hack'24 (IIT Dhanbad).
• Uses the LSTM-based Tesseract OCR library for text recognition from images.
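The link-extraction step can be illustrated with a short sketch. It assumes the OCR stage has already produced plain text (for example via `pytesseract.image_to_string`, which is not called here so the example stays offline and dependency-free). The regular expressions and the names `extract_links` and `extract_emails` are illustrative assumptions, not the project's actual implementation.

```python
import re

# Simple patterns; real-world URL matching is messier than this.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TRAILING_PUNCTUATION = ".,;:!?)]}"


def extract_links(text):
    """Return unique URLs in order of first appearance."""
    seen, links = set(), []
    for match in URL_PATTERN.finditer(text):
        # OCR text often glues sentence punctuation onto the URL; strip it.
        url = match.group(0).rstrip(TRAILING_PUNCTUATION)
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def extract_emails(text):
    """Return unique e-mail addresses, sorted alphabetically."""
    return sorted(set(EMAIL_PATTERN.findall(text)))


# Hypothetical OCR output from a screenshot.
ocr_text = """Docs: https://example.com/guide, mirror at www.example.org.
Contact: team@example.com (see https://example.com/guide)"""

links = extract_links(ocr_text)
emails = extract_emails(ocr_text)
```

Stripping trailing punctuation trades a few false negatives (URLs that really end in `)`) for cleaner results on typical screenshot text.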
Visit GitHub repo