LLM Performance Researcher
LLM Performance Metrics Researcher
Project Title:
Performance and efficacy of machine learning and artificial intelligence implementation in economics and life cycle system science decision support
Project Description:
The sponsor seeks a computer scientist (US citizen) to join their team in researching the performance and efficacy of machine learning and artificial intelligence (large language models, or LLMs) across a variety of research-oriented and public-facing software applications using standardized, science-based metrics. The initial focus is on developing LLM-based outputs and comparing their performance with that of manually developed outputs. These outputs include annotated bibliographies and literature reviews, code, web applications that incorporate LLMs to enhance capabilities, and LLM-based web applications.
The ideal candidate will have a strong background in LLM API integration, React, and front-end programming, and will be responsible for transforming models into usable APIs or integrated tools for production. They will also be responsible for monitoring, troubleshooting, and enhancing model efficiency and scalability. The candidate should be well-versed in data preprocessing and analysis for model training. The candidate should be familiar with prompt engineering techniques, intelligent prompt caching (e.g., Redis), vector stores (e.g., Pinecone), and efficient token management. The candidate should know how to implement AI security protocols, including guardrails and techniques to prevent prompt injection, and should have a working knowledge of RAG (Retrieval Augmented Generation). Additional research tasks may be assigned based on the candidate’s skill set and priorities.
Key Responsibilities:
- Develop user interfaces for web applications
- Assist with special software development projects as assigned
- Write and implement efficient code
- Work closely with other developers
- Statistically compare performance across code and tool designs
- Draft manuscripts documenting the methodology and results
Desired Qualifications:
- US Citizen
- Master’s degree in Computer Science or related field
- At least 2 years of professional experience
- At least 1 year as development team lead for at least one web application using the software stacks listed below
- Experience with state management in React (RxJS)
- Experience working on cloud technologies (AWS, Azure)
- Working knowledge of RAG (Retrieval Augmented Generation)
- Proficient with integrating one or more LLMs into applications (e.g., OpenAI, Gemini, Llama)
- Proficient with HTML, CSS, TypeScript, React, and Python
- Proficient with one or more UI component libraries (e.g., Ant Design, Material UI)
- Proficient with Node.js
- Working knowledge of building quick prototypes using Streamlit (or similar) and LLMs
- Proficient with JSON
- Proficient with Vite, Nginx, GitHub, Docker, and Portainer
- GPU programming or data visualization experience a plus
- Evidence of strong oral and written communication skills (including authorship of at least 2 technical publications)
- Strong logical thinking and problem solving
- Excellent attention to detail
Other Details:
- Full-time: the participant is expected to work 40 hours a week
- Location: the participant will work at the NIST Gaithersburg Campus.
- Duration: this is expected to be a one-year position. Extensions are sometimes granted depending on the availability of funds.