Qompass | LinkedIn
Cost Conscious AI Services is pleased to share: convert your favorite LLM into an embedding model with LLM2Vec, then run RAG with the same model handling both generation and retrieval. Conversion follows a simple 2-step process:
- Load the desired LLM into the llm2vec package for conversion to an embedding model.
- Train with MNTP (masked next token prediction) or unsupervised SimCSE for better results on retrieval tasks.
I tried the embedding model conversion on Llama 3 with a single A100 GPU, training with MNTP on a 100k subset of Cosmopedia.
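For a feel of the retrieval half of RAG, here is a minimal sketch of ranking documents by cosine similarity over embeddings. It does not call llm2vec: the 3-dimensional vectors and the `retrieve` helper are hypothetical stand-ins for whatever your converted model actually produces.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, top_k=2):
    """Return (doc_id, score) pairs for the top_k most similar documents."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones would come from the embedding model.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query_vec = [0.9, 0.1, 0.0]

top = retrieve(query_vec, doc_vecs, top_k=2)
for doc_id, score in top:
    print(doc_id, round(score, 3))
```

In a real pipeline the retrieved documents would then be placed in the prompt for the generation step.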
Find out more here:
- Colab Notebook: https://lnkd.in/dvCwbTQC
- Paper: https://lnkd.in/d2JhcCEr
- Github: https://lnkd.in/dj-iwD5h
Announcing a new annotated dataset on @huggingface covering all U.S. bills (approximately 119,000) since the 108th Congress. What makes this dataset special is that it includes labels (*policy area* and *legislative subjects*) that have been painstakingly annotated over the last 20 years by expert analysts at the Congressional Research Service of the Library of Congress.
Are you inspired to train a model to do better?
The essential question this dataset poses is "what is this bill 'about'?" That is a real challenge when you have to choose among thousands of subject areas to classify a bill that could be more than 500 pages long, and it takes a great deal of *judgment*. Now, with this training set, it should be possible to apply traditional classification approaches, fine-tune an LLM, or try other ML methods.
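As a rough illustration of the "traditional classification" route, here is a minimal naive Bayes sketch in plain Python. The four example bills and the two labels are made up for illustration and are not taken from the dataset; a real baseline would train on the annotated *policy area* labels and on far more text.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not real rows from the dataset.
examples = [
    ("A bill to expand rural hospital funding and Medicare coverage", "Health"),
    ("A bill to improve veterans health clinics and hospital staffing", "Health"),
    ("A bill to authorize highway and bridge repair grants", "Transportation"),
    ("A bill to fund rail safety and highway maintenance", "Transportation"),
]
model = train(examples)
print(predict("A bill to increase hospital funding", model))
print(predict("A bill to repair highway bridges", model))
```

The *legislative subjects* labels are multi-valued per bill, so they would need a multi-label setup rather than this single-label sketch.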
It is a relevant problem because the Library of Congress continues to label bills by hand, and automation could allow them to process bills a whole lot faster. If you're interested, grab the dataset from Hugging Face, or get in touch to collaborate!
Find out more here: https://lnkd.in/gfeZ4tBn
Excited to announce Microsoft's collaboration with Seedrs to bring you SERIES AI, the disruptive AI and funding accelerator for Seed to Series A founders!
Key Dates:
- Applications close: 10th May 2024
- Accelerator commences: 13th May 2024
- Pitching Demo Day: 17th June 2024
- Pitching Demo Day Location: Microsoft Reactor in London Paddington
See you there!
Find out more and apply here: https://lnkd.in/eEArAqP2
A quick look at what it means to be environmentally conscious, following the example @meta AI set by openly reporting the CO2 emissions from the Llama 3 training process.