AI60213 - Foundation of Large Language Models

Language, a complex and intricate system of human expression, poses a significant challenge to AI in both understanding and generation. Language models are the foundation of language understanding, and the task of language modelling has evolved from statistical to neural models. In recent times, pre-trained language models (PLMs), founded on the Transformer architecture and trained on exceedingly large amounts of data, have shown success in solving the majority of NLP tasks. Performance on various tasks has been observed to improve with the scale of the models. Consequently, several Large Language Models (LLMs) have been developed that not only solve specific tasks but have also been observed to exhibit emergent behaviour. LLMs have also been adopted for other data modalities such as vision and speech, and for multi-modal settings. Very recently, LLMs have been used to develop autonomous agents that can solve complex tasks. This course aims to provide foundational knowledge of the key technologies for developing, leveraging and augmenting LLMs.

Instructor

Plaban Kumar Bhowmick
plaban@ai.iitkgp.ac.in

Logistics

Time: Monday 11:00AM - 11:55AM, Tuesday 8:00AM - 9:55AM
Location: NR-213, Nalanda Classroom Complex
Credits: 3-0-0

Course Teaching Assistants

Animesh
Teaching Assistant (Course Webpage Designer and Maintainer)
Office Hours:
Mondays 3-4 PM
Doubts about Content and Project
Rekha Regar
Teaching Assistant
Office Hours:
Mondays 3-4 PM
Doubts about Assignments
Trishita Mukherjee
Teaching Assistant
Office Hours:
Tuesdays 3-4 PM
Doubts about Vlog

Course Objective

Upon completion of the course, students will be able to:

  • choose, integrate and identify strategies for building LLMs
  • identify strategies and techniques for scaling up LLM development
  • use and compare prompt engineering techniques to solve complex tasks such as reasoning
  • evaluate LLMs by setting up benchmarks
  • identify potential threats posed by LLMs and apply mitigation techniques such as Retrieval Augmented Generation
  • implement techniques to align LLM training with human goals and behaviours
  • implement autonomous agents (agentic AI) that use LLMs to solve complex tasks

Prerequisites

  • Machine Learning and Deep Learning

Grading Policy

The final grade will be calculated based on the following components:

  • Assignments (30%): Programming assignments and problem sets
  • Final Project (10%): Research project with implementation and presentation
  • Vlog (15%): Video presentations and technical demonstrations
  • Mid-semester Exam (20%): Comprehensive examination covering the first half of the course
  • End-semester Exam (25%): Final comprehensive examination

Additional Notes:

  • Late submission policy: 10% deduction per day for assignments
  • Final project includes code, report, and presentation components
  • Attendance and class participation may contribute bonus points

General Information

NOTE: We believe that the ethics and social implications of LLMs are extremely important topics to discuss. As these topics are covered in the 'AI and Ethics' course (CS60016) offered by the Dept. of CSE, this course does not include them.

Announcements