Tencent Releases a Training Method for Trillion-Parameter Models: A Trillion-Parameter NLP Model Trained in as Little as One Day on 256 GPUs (Part 6)


[10] ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation. https://arxiv.org/abs/2112.12731
[11] PaLM: Scaling Language Modeling with Pathways. https://arxiv.org/abs/2204.02311
[12] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. https://arxiv.org/abs/2112.06905
[13] Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers. https://arxiv.org/abs/2002.11794
[14] A Review of Sparse Expert Models in Deep Learning. https://arxiv.org/abs/2209.01667
[15] RoFormer: Enhanced Transformer with Rotary Position Embedding. https://arxiv.org/abs/2104.09864
[16] Talking-Heads Attention. https://arxiv.org/abs/2003.02436
[17] GLU Variants Improve Transformer. https://arxiv.org/abs/2002.05202
[18] Tencent AI Lab releases the intelligent writing assistant Effidit (文涌), using technology to help ideas flow. https://mp.weixin.qq.com/s/b-kPSR3aFPKHpUnFv7gmeA
[19] Tencent's HunYuan AI large model tops three major CLUE leaderboards, breaking multiple industry records. http://ex.chinadaily.com.cn/exchange/partners/82/rss/channel/cn/columns/snl9a7/stories/WS628df605a3101c3ee7ad730e.html
— End —
QbitAI (量子位) · Signed Toutiao account