AI60213 - Foundation of Large Language Models

Language, a complex and intricate system of human expression, poses a significant challenge to AI in both understanding and generation. Language models are the foundation of language understanding, and the task of language modelling has evolved from statistical to neural models. In recent times, pre-trained language models (PLMs), founded on the Transformer architecture and trained on exceedingly large amounts of data, have shown success in solving the majority of NLP tasks. Performance on various tasks has been observed to improve with the scale of the models. Consequently, several Large Language Models (LLMs) have been developed that not only solve specific tasks but have also been observed to exhibit emergent behaviour. LLMs have also been adopted for other data modalities such as vision, speech and multi-modal settings. Very recently, LLMs have been used to develop autonomous agents that can solve complex tasks. This course aims to provide foundational knowledge of the key technologies for developing, leveraging and augmenting LLMs.


Instructor

Plaban Kumar Bhowmick

plaban@ai.iitkgp.ac.in

Logistics

Time: Monday 11:00AM - 11:55AM, Tuesday 8:00AM - 9:55AM

Location: NR-213, Nalanda Classroom Complex

Credits: 3-0-0

Course Teaching Assistants

Animesh

Teaching Assistant (Course Webpage Designer and Maintainer)

Office Hours:

Mondays 3-4 PM

Doubts about Content and Project

Rekha Regar

Teaching Assistant

Office Hours:

Mondays 3-4 PM

Doubts about Assignments

Trishita Mukherjee

Teaching Assistant

Office Hours:

Tuesdays 3-4 PM

Doubts about Vlog

Course Objective

Upon completion of the course, students will be able to:

choose, integrate and identify strategies for building LLMs.
identify strategies and techniques for scaling up LLM development.
use and compare prompt engineering techniques to solve complex tasks such as reasoning.
evaluate LLMs by setting up benchmarks.
identify potential threats posed by LLMs and apply mitigation techniques such as Retrieval-Augmented Generation.
implement techniques to align LLM training with human goals and behaviors.
implement autonomous agents (agentic AI) that use LLMs to solve complex tasks.

Grading Policy

The final grade will be calculated based on the following components:

Assignments (30%): Programming assignments and problem sets
Final Project (10%): Research project with implementation and presentation
Vlog (15%): Video presentations and technical demonstrations
Mid-semester Exam (20%): Comprehensive examination covering first half
End-semester Exam (25%): Final comprehensive examination

Additional Notes:

Late submission policy: 10% deduction per day for assignments
Final project includes code, report, and presentation components
Attendance and class participation may contribute bonus points

General Information

NOTE: We believe that the ethics and social implications of LLMs are extremely important topics to discuss. As these topics are covered in the 'AI and Ethics' course (CS60016) offered by the Dept. of CSE, this course does not cover them.
