Tahmid Hasan

Hi! I am an assistant professor at the Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), and am affiliated with the BUET CSE NLP Group. I received my Master's degree from the same department, where I was fortunate to be supervised by Prof. Rifat Shahriyar.

Until recently, my research interests were in low-resource, multilingual, and cross-lingual natural language processing. I focused particularly on using compute and data efficiently in scenarios where one or both were limited.

Over the last four years, I have worked on multiple projects in machine translation, text summarization, natural language understanding and generation, and NLP for programming languages and software engineering.

Prior to that, I received my Bachelor's degree, also from CSE, BUET, where I worked on bioinformatics and networking under the supervision of Prof. M. Sohel Rahman.

Email: tahmidhasan [at] cse.buet.ac.bd

Links: [Curriculum Vitae] [Google Scholar] [Twitter] [GitHub]


Publications

  1. XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages
    Tahmid Hasan, Abhik Bhattacharjee, Md. Saiful Islam, Kazi Mubasshir, Yuan-Fang Li, Yong-Bin Kang, M. Sohel Rahman, Rifat Shahriyar
    In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
    [bib] [abstract] [code]

  2. CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs
    Abhik Bhattacharjee*, Tahmid Hasan*, Wasi Uddin Ahmad, Yuan-Fang Li, Yong-Bin Kang, Rifat Shahriyar
    *: Equal contribution
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
    [bib] [abstract] [code]

  3. Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
    Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin, Masum Hasan, Madhusudan Basak, M. Sohel Rahman, Rifat Shahriyar
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
    [bib] [abstract] [code]

  4. BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla
    Abhik Bhattacharjee*, Tahmid Hasan*, Wasi Uddin Ahmad, Kazi Samin, Md Saiful Islam, M. Sohel Rahman, Anindya Iqbal, Rifat Shahriyar
    *: Equal contribution
    In Findings of the Association for Computational Linguistics: NAACL 2022
    [bib] [abstract] [code]

  5. BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla
    Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Rifat Shahriyar
    In Findings of the Association for Computational Linguistics: EACL 2023
    [bib] [abstract] [code]

  6. Contrastive Learning for API Aspect Analysis
    GM Shahariar, Tahmid Hasan, Anindya Iqbal, Gias Uddin
    In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023)
    [bib] [abstract] [code]

  7. CoDesc: A Large Code–Description Parallel Dataset
    Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Ahmad, Anindya Iqbal, Rifat Shahriyar
    In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
    [bib] [abstract] [code]

  8. GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
    Sebastian Gehrmann, ... Tahmid Hasan, ...
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
    [bib] [abstract] [code]

  9. Using Adaptive Heartbeat Rate on Long-Lived TCP Connections
    M. Saifur Rahman, Md. Yusuf Sarwar Uddin, Tahmid Hasan, M. Sohel Rahman, M. Kaykobad
    In IEEE/ACM Transactions on Networking (Volume: 26, Issue: 1, Feb. 2018)
    [bib] [abstract] [code]

Pre-prints

  1. BERT2Code: Can Pretrained Language Models be Leveraged for Code Search?
    Abdullah Al Ishtiaq, Masum Hasan, Md. Mahim Anjum Haque, Kazi Sajeed Mehrab, Tanveer Muttaqueen, Tahmid Hasan, Anindya Iqbal, Rifat Shahriyar
    ArXiv Pre-print, 2021
    [bib] [abstract]


Ongoing Projects


  1. Aligning Large Language Models with Human Values [statement]
    Supervisor: Prof. Rifat Shahriyar

  2. Retrieval-Augmented Large Language Models [statement]
    Supervisor: Prof. Rifat Shahriyar

  3. Vision-Enhanced Large Language Models [statement]
    Supervisor: Prof. Rifat Shahriyar


Education

  • (June 2019 - May 2022) Master of Science in Computer Science and Engineering
    Department of Computer Science and Engineering
    Bangladesh University of Engineering and Technology (BUET)
    Supervisor: Prof. Rifat Shahriyar
    CGPA: 3.92/4.00

  • (February 2015 - April 2019) Bachelor of Science in Computer Science and Engineering
    Department of Computer Science and Engineering
    Bangladesh University of Engineering and Technology (BUET)
    CGPA: 3.98/4.00
    Thesis Supervisor: Prof. M. Sohel Rahman

Professional Experience

  • (July 2022 - Present) Assistant Professor
    Department of Computer Science and Engineering
    Bangladesh University of Engineering and Technology (BUET)

  • (October 2019 - July 2022) Lecturer
    Department of Computer Science and Engineering
    Bangladesh University of Engineering and Technology (BUET)

  • (June 2022 - Present) Research Affiliate
    BUET CSE NLP Group
    Supervisor: Prof. Rifat Shahriyar

  • (June 2019 - May 2022) Graduate Research Assistant
    Department of Computer Science and Engineering
    Bangladesh University of Engineering and Technology (BUET)
    Supervisor: Prof. Rifat Shahriyar

Honors & Awards

  • (2019 - 2022) University Merit Scholarships in each semester for excellent postgraduate results
  • (2015 - 2019) Dean's Award in each academic year for excellent undergraduate results
  • (2015 - 2019) University Merit Scholarships in each semester for excellent undergraduate results

Teaching Experience

  • (January 2023) Course Instructor @ Mathematical Analysis for Computer Science (CSE 301)
  • (January 2023) Course Instructor @ Information System Design Sessional (CSE 326)
  • (January 2023) Course Instructor @ Software Development Sessional (CSE 408)
  • (July 2022) Course Instructor @ Mathematical Analysis for Computer Science (CSE 301)
  • (July 2022) Course Instructor @ Numerical Methods Sessional (CSE 218)
  • (July 2022) Course Instructor @ Operating Systems Sessional (CSE 314)
  • (January 2022) Course Instructor @ Computer Architecture (CSE 305)
  • (January 2022) Course Instructor @ Computer Architecture Sessional (CSE 306)
  • (January 2022) Course Instructor @ Software Development (CSE 408)
  • (January 2022) Course Instructor @ Object Oriented Programming Language Sessional (CSE 108)
  • (July 2021) Course Instructor @ Machine Learning (CSE 471)
  • (July 2021) Course Instructor @ Data Structures and Algorithms I Sessional (CSE 204)
  • (July 2021) Course Instructor @ Database Sessional (CSE 216)
  • (July 2021) Course Instructor @ Numerical Methods Sessional (CSE 218)
  • (July 2021) Course Instructor @ Machine Learning Sessional (CSE 472)
  • (January 2021) Course Instructor @ Object Oriented Programming Language Sessional (CSE 108)
  • (January 2021) Course Instructor @ Digital Logic Design Sessional (CSE 206)
  • (January 2021) Course Instructor @ Software Engineering Sessional (CSE 308)
  • (January 2021) Course Instructor @ Microprocessors, Microcontrollers, and Embedded Systems Sessional (CSE 316)
  • (January 2020) Course Instructor @ Data Structures and Algorithms II Sessional (CSE 208)
  • (January 2020) Course Instructor @ Computer Networks Sessional (CSE 322)
  • (January 2020) Course Instructor @ Simulation and Modeling Sessional (CSE 412)
  • (January 2020) Course Instructor @ Algorithm Engineering Sessional (CSE 462)
  • (January 2020) Course Instructor @ Machine Learning Sessional (CSE 472)
  • (July 2019) Course Instructor @ Data Structures and Algorithms I Sessional (CSE 204)
  • (July 2019) Course Instructor @ Numerical Methods (CSE 218)
  • (July 2019) Course Instructor @ Computer Programming Techniques Sessional (CSE 296)
  • (July 2019) Course Instructor @ Software Development (CSE 408)