Adaptive computation modules : granular conditional computation for efficient inference

2025
book section
conference proceedings
dc.abstract.en
While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective" width needed to process a token can vary from layer to layer. Motivated by this observation, we introduce the Adaptive Computation Module (ACM), a generic module that dynamically adapts its computational load to match the estimated difficulty of the input on a per-token basis. An ACM consists of a sequence of learners that progressively refine the output of their preceding counterparts. An additional gating mechanism determines the optimal number of learners to execute for each token. We also propose a distillation technique to replace any pre-trained model with an "ACMized" variant. Our evaluation of transformer models in computer vision and speech recognition demonstrates that substituting layers with ACMs significantly reduces inference costs without degrading the downstream accuracy for a wide interval of user-defined budgets.
dc.affiliation
Szkoła Doktorska Nauk Ścisłych i Przyrodniczych (Doctoral School of Exact and Natural Sciences)
dc.conference
39th AAAI Conference on Artificial Intelligence
dc.conference.city
Philadelphia, Pennsylvania
dc.conference.country
United States
dc.conference.datefinish
2025-03-04
dc.conference.datestart
2025-02-25
dc.conference.series
National Conference of the American Association for Artificial Intelligence
dc.conference.seriesshortcut
AAAI
dc.conference.shortcut
AAAI-25
dc.conference.weblink
https://aaai.org/conference/aaai/aaai-25/
dc.contributor.author
Wójcik, Bartosz - 422840
dc.contributor.author
Devoto, Alessio
dc.contributor.author
Pustelnik, Karol
dc.contributor.author
Minervini, Pasquale
dc.contributor.author
Scardapane, Simone
dc.contributor.editor
Walsh, Toby
dc.contributor.editor
Shah, Julie
dc.contributor.editor
Kolter, Zico
dc.date.accession
2025-05-06
dc.date.accessioned
2025-05-06T09:05:19Z
dc.date.available
2025-05-06T09:05:19Z
dc.date.createdat
2025-04-14T09:14:19Z
dc.date.issued
2025
dc.date.openaccess
0
dc.description.accesstime
at the time of publication
dc.description.conftype
international
dc.description.physical
21510-21518
dc.description.series
Proceedings of the AAAI Conference on Artificial Intelligence
dc.description.version
publisher's final version
dc.description.volume
39
dc.identifier.bookweblink
https://ruj.uj.edu.pl/entities/publication/f3a7e81e-9027-4663-b2ad-4aea425c76c2
dc.identifier.doi
10.1609/aaai.v39i20.35453
dc.identifier.isbn
978-1-57735-897-8
dc.identifier.serieseissn
2374-3468
dc.identifier.seriesissn
2159-5399
dc.identifier.uri
https://ruj.uj.edu.pl/handle/item/552028
dc.identifier.weblink
https://ojs.aaai.org/index.php/AAAI/article/view/35453
dc.language
eng
dc.language.container
eng
dc.place
Washington
dc.publisher
AAAI Press
dc.rights
Bibliographic description only
dc.rights.licence
Other open licence
dc.share.type
other
dc.subtype
ConferenceProceedings
dc.title
Adaptive computation modules : granular conditional computation for efficient inference
dc.title.container
Proceedings of the 39th AAAI Conference on Artificial Intelligence
dc.title.volume
AAAI-25 Technical Tracks 20
dc.type
BookSection
dspace.entity.type
Publication