%%capture
from fastai_transformers_utils.all import TransformersTokenizer, TransformersNumericalize, Pad2Max, BertSeqClassificationCallback, roberta_SeqClassification_split

import json
from typing import *
from fastai2.basics import *
from fastai2.text.all import *
from fastai2.callback.all import *

from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification, RobertaForSequenceClassification, RobertaTokenizer
# all_slow
num_class = 2
# model_name = 'bert-base-uncased'
# model_name = 'distilbert-base-uncased'
# model_name = 'albert-base-v2'
model_name = 'roberta-base'
max_len = 150
tokenizer = AutoTokenizer.from_pretrained(model_name)

RobertaTokenizerFast has an issue in masked language modeling: it introduces an extra encoded space before the mask token. See https://github.com/huggingface/transformers/pull/2778 for more information.

Data and Tokenization

path = untar_data(URLs.IMDB_SAMPLE)
df = pd.read_csv(path/'texts.csv')
df.head()
label text is_valid
0 negative Un-bleeping-believable! Meg Ryan doesn't even look her usual pert lovable self in this, which normally makes me forgive her shallow ticky acting schtick. Hard to believe she was the producer on this dog. Plus Kevin Kline: what kind of suicide trip has his career been on? Whoosh... Banzai!!! Finally this was directed by the guy who did Big Chill? Must be a replay of Jonestown - hollywood style. Wooofff! False
1 positive This is a extremely well-made film. The acting, script and camera-work are all first-rate. The music is good, too, though it is mostly early in the film, when things are still relatively cheery. There are no really superstars in the cast, though several faces will be familiar. The entire cast does an excellent job with the script.<br /><br />But it is hard to watch, because there is no good end to a situation like the one presented. It is now fashionable to blame the British for setting Hindus and Muslims against each other, and then cruelly separating them into two countries. There is som... False
2 negative Every once in a long while a movie will come along that will be so awful that I feel compelled to warn people. If I labor all my days and I can save but one soul from watching this movie, how great will be my joy.<br /><br />Where to begin my discussion of pain. For starters, there was a musical montage every five minutes. There was no character development. Every character was a stereotype. We had swearing guy, fat guy who eats donuts, goofy foreign guy, etc. The script felt as if it were being written as the movie was being shot. The production value was so incredibly low that it felt li... False
3 positive Name just says it all. I watched this movie with my dad when it came out and having served in Korea he had great admiration for the man. The disappointing thing about this film is that it only concentrate on a short period of the man's life - interestingly enough the man's entire life would have made such an epic bio-pic that it is staggering to imagine the cost for production.<br /><br />Some posters elude to the flawed characteristics about the man, which are cheap shots. The theme of the movie "Duty, Honor, Country" are not just mere words blathered from the lips of a high-brassed offic... False
4 negative This movie succeeds at being one of the most unique movies you've seen. However this comes from the fact that you can't make heads or tails of this mess. It almost seems as a series of challenges set up to determine whether or not you are willing to walk out of the movie and give up the money you just paid. If you don't want to feel slighted you'll sit through this horrible film and develop a real sense of pity for the actors involved, they've all seen better days, but then you realize they actually got paid quite a bit of money to do this and you'll lose pity for them just like you've alr... False
tok_list = L(parallel_gen(TransformersTokenizer, df.text, tokenizer=tokenizer)
            ).sorted().itemgot(1)
tok_df = df.copy()
tok_df.text = tok_list.map(lambda x: ' '.join(x)) # join tokens with ' ' so they can be split back later
tok_df.head()
label text is_valid
0 negative <s> Un - ble eping - bel iev able ! ĠMeg ĠRyan Ġdoesn 't Ġeven Ġlook Ġher Ġusual Ġpert Ġl ovable Ġself Ġin Ġthis , Ġwhich Ġnormally Ġmakes Ġme Ġforgive Ġher Ġshallow Ġtick y Ġacting Ġsch tick . ĠHard Ġto Ġbelieve Ġshe Ġwas Ġthe Ġproducer Ġon Ġthis Ġdog . ĠPlus ĠKevin ĠK line : Ġwhat Ġkind Ġof Ġsuicide Ġtrip Ġhas Ġhis Ġcareer Ġbeen Ġon ? ĠWho osh ... ĠBan zai !!! ĠFinally Ġthis Ġwas Ġdirected Ġby Ġthe Ġguy Ġwho Ġdid ĠBig ĠChill ? ĠMust Ġbe Ġa Ġreplay Ġof ĠJon est own Ġ- Ġh ollywood Ġstyle . ĠWoo off f ! </s> False
1 positive <s> This Ġis Ġa Ġextremely Ġwell - made Ġfilm . ĠThe Ġacting , Ġscript Ġand Ġcamera - work Ġare Ġall Ġfirst - rate . ĠThe Ġmusic Ġis Ġgood , Ġtoo , Ġthough Ġit Ġis Ġmostly Ġearly Ġin Ġthe Ġfilm , Ġwhen Ġthings Ġare Ġstill Ġrelatively Ġche ery . ĠThere Ġare Ġno Ġreally Ġsuperst ars Ġin Ġthe Ġcast , Ġthough Ġseveral Ġfaces Ġwill Ġbe Ġfamiliar . ĠThe Ġentire Ġcast Ġdoes Ġan Ġexcellent Ġjob Ġwith Ġthe Ġscript .< br Ġ/ >< br Ġ/> But Ġit Ġis Ġhard Ġto Ġwatch , Ġbecause Ġthere Ġis Ġno Ġgood Ġend Ġto Ġa Ġsituation Ġlike Ġthe Ġone Ġpresented . ĠIt Ġis Ġnow Ġfashionable Ġto Ġblame Ġthe ĠBritish Ġfor... False
2 negative <s> Every Ġonce Ġin Ġa Ġlong Ġwhile Ġa Ġmovie Ġwill Ġcome Ġalong Ġthat Ġwill Ġbe Ġso Ġawful Ġthat ĠI Ġfeel Ġcompelled Ġto Ġwarn Ġpeople . ĠIf ĠI Ġlabor Ġall Ġmy Ġdays Ġand ĠI Ġcan Ġsave Ġbut Ġone Ġsoul Ġfrom Ġwatching Ġthis Ġmovie , Ġhow Ġgreat Ġwill Ġbe Ġmy Ġjoy .< br Ġ/ >< br Ġ/> Where Ġto Ġbegin Ġmy Ġdiscussion Ġof Ġpain . ĠFor Ġstarters , Ġthere Ġwas Ġa Ġmusical Ġmont age Ġevery Ġfive Ġminutes . ĠThere Ġwas Ġno Ġcharacter Ġdevelopment . ĠEvery Ġcharacter Ġwas Ġa Ġstereotype . ĠWe Ġhad Ġswearing Ġguy , Ġfat Ġguy Ġwho Ġeats Ġdon uts , Ġgoofy Ġforeign Ġguy , Ġetc . ĠThe Ġscript Ġfelt Ġas ... False
3 positive <s> Name Ġjust Ġsays Ġit Ġall . ĠI Ġwatched Ġthis Ġmovie Ġwith Ġmy Ġdad Ġwhen Ġit Ġcame Ġout Ġand Ġhaving Ġserved Ġin ĠKorea Ġhe Ġhad Ġgreat Ġadmiration Ġfor Ġthe Ġman . ĠThe Ġdisappointing Ġthing Ġabout Ġthis Ġfilm Ġis Ġthat Ġit Ġonly Ġconcentrate Ġon Ġa Ġshort Ġperiod Ġof Ġthe Ġman 's Ġlife Ġ- Ġinterestingly Ġenough Ġthe Ġman 's Ġentire Ġlife Ġwould Ġhave Ġmade Ġsuch Ġan Ġepic Ġbio - pic Ġthat Ġit Ġis Ġstaggering Ġto Ġimagine Ġthe Ġcost Ġfor Ġproduction .< br Ġ/ >< br Ġ/> Some Ġposters Ġel ude Ġto Ġthe Ġflawed Ġcharacteristics Ġabout Ġthe Ġman , Ġwhich Ġare Ġcheap Ġshots . ĠThe Ġtheme Ġo... False
4 negative <s> This Ġmovie Ġsucceeds Ġat Ġbeing Ġone Ġof Ġthe Ġmost Ġunique Ġmovies Ġyou 've Ġseen . ĠHowever Ġthis Ġcomes Ġfrom Ġthe Ġfact Ġthat Ġyou Ġcan 't Ġmake Ġheads Ġor Ġtails Ġof Ġthis Ġmess . ĠIt Ġalmost Ġseems Ġas Ġa Ġseries Ġof Ġchallenges Ġset Ġup Ġto Ġdetermine Ġwhether Ġor Ġnot Ġyou Ġare Ġwilling Ġto Ġwalk Ġout Ġof Ġthe Ġmovie Ġand Ġgive Ġup Ġthe Ġmoney Ġyou Ġjust Ġpaid . ĠIf Ġyou Ġdon 't Ġwant Ġto Ġfeel Ġslight ed Ġyou 'll Ġsit Ġthrough Ġthis Ġhorrible Ġfilm Ġand Ġdevelop Ġa Ġreal Ġsense Ġof Ġpity Ġfor Ġthe Ġactors Ġinvolved , Ġthey 've Ġall Ġseen Ġbetter Ġdays , Ġbut Ġthen Ġyou Ġreali... False
# tok_df.to_csv(f'{model_name}_tok.csv', index=False)
# tok_df = pd.read_csv(f'{model_name}_tok.csv')
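
The `Ġ` prefix in the tokenized text above is how byte-level BPE (the scheme RoBERTa's tokenizer uses) marks a token that starts with a space. The following is a minimal sketch, in plain Python and without the tokenizer itself, of how such tokens map back to readable text. `detok` is a hypothetical helper written for illustration, not a transformers function, and it ignores the other byte-level remappings (newlines, non-ASCII bytes).

```python
def detok(tokens):
    """Join byte-level BPE tokens and turn the 'Ġ' marker back into a space."""
    return ''.join(tokens).replace('Ġ', ' ')

# First tokens of row 0 in tok_df above, split on ' ' as the pipeline does later.
tokens = '<s> Un - ble eping - bel iev able ! ĠMeg ĠRyan'.split(' ')
text = detok(tokens)
print(text)
```

This is also why joining the tokens with a plain `' '` is safe here: real spaces are already encoded inside the tokens as `Ġ`, so the separator is unambiguous.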

DataLoaders

splits = ColSplitter()(tok_df)
ds_tfms = [
    [attrgetter('text'), lambda x: x.split(' '), TransformersNumericalize(tokenizer), Pad2Max(max_len, tokenizer.pad_token_id)], 
    [attrgetter("label"), Categorize()]
]
dss = Datasets(tok_df, tfms=ds_tfms, splits=splits)
dss.train[0], dss.decode(dss.train[0])
((TensorText([    0,  9685,    12,  5225, 24320,    12,  8494, 18421,   868,   328,
          14938,  1774,   630,    75,   190,   356,    69,  4505, 32819,   784,
          30289,  1403,    11,    42,     6,    61,  6329,   817,   162, 20184,
             69, 16762, 10457,   219,  3501,  8447, 41791,     4,  6206,     7,
            679,    79,    21,     5,  3436,    15,    42,  2335,     4,  4642,
           2363,   229,  1902,    35,    99,   761,     9,  4260,  1805,    34,
             39,   756,    57,    15,   116,  3394,  5212,   734,  5981, 23642,
          16506,  3347,    42,    21,  3660,    30,     5,  2173,    54,   222,
           1776, 25928,   116,  8495,    28,    10, 16462,     9,  4160,   990,
           3355,   111,  1368,  9718,  2496,     4, 29935,  1529,   506,   328,
              2,     1,     1,     1,     1,     1,     1,     1,     1,     1,
              1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
              1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
              1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
              1,     1,     1,     1,     1,     1,     1,     1,     1,     1]),
  TensorCategory(0)),
 ("<s>Un-bleeping-believable! Meg Ryan doesn't even look her usual pert lovable self in this, which normally makes me forgive her shallow ticky acting schtick. Hard to believe she was the producer on this dog. Plus Kevin Kline: what kind of suicide trip has his career been on? Whoosh... Banzai!!! Finally this was directed by the guy who did Big Chill? Must be a replay of Jonestown - hollywood style. Wooofff!</s>",
  'negative'))
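
In the tensor above, the sequence ends with id 2 (`</s>`) and is then filled with 1s, the pad id, up to `max_len`. The sketch below shows that idea in plain Python, together with the attention mask a model would typically derive from the pad id. `pad_to_max` and `attention_mask` are hypothetical helpers; the actual `Pad2Max` and `BertSeqClassificationCallback` in fastai_transformers_utils are not shown in this notebook and may behave differently, in particular around truncation.

```python
from typing import List

def pad_to_max(ids: List[int], max_len: int, pad_id: int) -> List[int]:
    """Cut the sequence at max_len, then right-pad with pad_id up to max_len."""
    ids = ids[:max_len]  # assumption: overlong inputs are simply cut off
    return ids + [pad_id] * (max_len - len(ids))

def attention_mask(ids: List[int], pad_id: int) -> List[int]:
    """1 for real tokens, 0 for padding positions."""
    return [int(i != pad_id) for i in ids]

# Shortened illustration: the first four ids of the sample above plus </s> (id 2).
ids = [0, 9685, 12, 5225, 2]
padded = pad_to_max(ids, max_len=8, pad_id=1)
mask = attention_mask(padded, pad_id=1)
print(padded)
print(mask)
```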
def get_dataloaders(dss, bs: int):
    return dss.dataloaders(bs=bs)

dls = get_dataloaders(dss, bs=2)
# for x in dls.train:
#     print(x[0].shape, x[0].dtype, x[0].device)
#     print(x[1].shape, x[1].dtype, x[1].device)
#     break
torch.Size([2, 150]) torch.int64 cuda:0
torch.Size([2]) torch.int64 cuda:0
dls.show_batch(max_n=2)
text category
0 <s>This was another obscure Christmas-related title, a low-budget Mexican production from exploitation film-maker Cardona (NIGHT OF THE BLOODY APES [1969], TINTORERA! [1977]), which – like many a genre effort from this country – was acquired for release in the U.S. by K. Gordon Murray. Judging by those two efforts already mentioned, Cardona was no visionary – and, this one having already received its share of flak over here, is certainly no better! The film, in fact, is quite redolent of the weirdness which characterized Mexican horror outings from the era, but given an added dimension by virtue of the garish color (which, in view of negative
1 <s>This movie was absolutely wonderful. The pre-partition time and culture has been recreated beautifully. Urmila has given yet another brilliant performance. What I truly admire about this movie is that it doesn't resort to Pakistan-bashing that is running rampant in movies like Gadar and LOC. With the partition as a backdrop, the movie does not divert to political issues or focus on violence or what is right and wrong. The movie always centers around the tragic story of Urmila's life. Her fragile relationship with Manoj Bajpai has been depicted excellently. The movie actually shows how the people, both Hindus and Muslims, have suffered from this partition. The theme that there is only one religion is truly prevalent in this positive

Learner and Train

dbunch = get_dataloaders(dss, bs=64)  # not used below: the Learner is built on `dls` (bs=2), as the summary's input shape shows
# pass num_labels at load time so the classification head is created with the right output size
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=num_class)
learn = Learner(dls, 
                model, 
                loss_func=CrossEntropyLossFlat(), 
                opt_func=ranger,
                splitter=roberta_SeqClassification_split, 
                cbs=[BertSeqClassificationCallback(tokenizer.pad_token_id)],
                metrics=[accuracy],
               )

learn.freeze_to(-1)
learn.summary()
RobertaForSequenceClassification (Input shape: ['2 x 150'])
================================================================
Layer (type)         Output Shape         Param #    Trainable 
================================================================
Embedding            2 x 150 x 768        38,603,520 False     
________________________________________________________________
Embedding            2 x 150 x 768        394,752    False     
________________________________________________________________
Embedding            2 x 150 x 768        768        False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
Dropout              2 x 12 x 150 x 150   0          False     
________________________________________________________________
Linear               2 x 150 x 768        590,592    False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 150 x 3072       2,362,368  False     
________________________________________________________________
Linear               2 x 150 x 768        2,360,064  False     
________________________________________________________________
LayerNorm            2 x 150 x 768        1,536      False     
________________________________________________________________
Dropout              2 x 150 x 768        0          False     
________________________________________________________________
Linear               2 x 768              590,592    False     
________________________________________________________________
Tanh                 2 x 768              0          False     
________________________________________________________________
Linear               2 x 768              590,592    True      
________________________________________________________________
Dropout              2 x 768              0          False     
________________________________________________________________
Linear               2 x 2                1,538      True      
________________________________________________________________

Total params: 125,237,762
Total trainable params: 592,130
Total non-trainable params: 124,645,632

Optimizer used: <function ranger at 0x7fb9c19522f0>
Loss function: FlattenedLoss of CrossEntropyLoss()

Model frozen up to parameter group number 14

Callbacks:
  - TrainEvalCallback
  - Recorder
  - ProgressCallback
  - BertSeqClassificationCallback
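
The parameter counts in the summary above can be checked by hand. After `freeze_to(-1)`, only the last two `Linear` rows are trainable (presumably the classification head's dense and output layers; the layer names are an assumption, since the summary does not print them). A short worked calculation:

```python
hidden, n_labels = 768, 2

def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

head_dense = linear_params(hidden, hidden)    # the 768 -> 768 Linear row
head_out = linear_params(hidden, n_labels)    # the 768 -> 2 Linear row
trainable = head_dense + head_out
total = trainable + 124_645_632               # non-trainable count from the summary

print(head_dense, head_out, trainable, total)
```

The result matches the "Total trainable params" and "Total params" lines reported by `learn.summary()`.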
# learn.lr_find()
learn.fit_one_cycle(1, 1e-2, moms=(0.8,0.7,0.8))
epoch train_loss valid_loss accuracy time
0 0.668175 0.650802 0.605000 00:17
# learn.fit_one_cycle(5, 1e-2, moms=(0.8,0.7,0.8))
# learn.freeze_to(-3)
# learn.fit_one_cycle(5, 1e-3, moms=(0.8,0.7,0.8))
learn.show_results()
text category category_
0 <s>There is no relation at all between Fortier and Profiler but the fact that both are police series about violent crimes. Profiler looks crispy, Fortier looks classic. Profiler plots are quite simple. Fortier's plot are far more complicated... Fortier looks more like Prime Suspect, if we have to spot similarities... The main character is weak and weirdo, but have "clairvoyance". People like to compare, to judge, to evaluate. How about just enjoying? Funny thing too, people writing Fortier looks American but, on the other hand, arguing they prefer American series (!!!). Maybe it's the language, or the spirit, but I think this series is more English than American. By the way, the positive negative
1 <s>Punctuating the opening credits sequence is a swarthy man having a strange, all-too-real nightmare. Closing in on its dystopic 2054 Paris, the film begins to follow a woman into a grungy club, where she and a Slavic bartender convene outside on the deck. They toss exclamations at each other to the effect that she owes him more money although she believes she's paid it all. Another woman obstructs the budding violence, only to have a bitter fight with the woman herself. The initial woman storms out, and she is kidnapped. Christian Volckman's Renaissance appears to be another one in an assembly line of recent motion-capture-animated sci-fi noir pictures, but positive positive

Export

# # fastai model
# learn.save(f'{model_name}_final')
# # transformers model
# Path('./models/transformers_model').mkdir(exist_ok=True)
# learn.model.save_pretrained('./models/transformers_model')
# Path('./models/tokenizer').mkdir(exist_ok=True)
# tokenizer.save_pretrained('./models/tokenizer')
# transform_info = {'category_map': list(learn.dls.vocab), 'max_len': max_len}
# transform_info = json.dumps(transform_info)
# Path('./models/transform_info.json').write_text(transform_info)
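
The export cell above stores the label vocabulary and `max_len` as JSON so that a client can rebuild the preprocessing without fastai. A minimal round-trip sketch using only the standard library; the category order `['negative', 'positive']` is an assumption inferred from the decoded sample earlier (`TensorCategory(0)` decodes to `'negative'`), and a temporary directory stands in for `./models`.

```python
import json
import tempfile
from pathlib import Path

# Assumed values: vocab order inferred from the notebook output, max_len as set at the top.
transform_info = {'category_map': ['negative', 'positive'], 'max_len': 150}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'transform_info.json'
    path.write_text(json.dumps(transform_info))   # what the export step does
    loaded = json.loads(path.read_text())         # what the client does

category_map, max_len = loaded['category_map'], loaded['max_len']
print(category_map, max_len)
```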

Inference

# inference_model = RobertaForSequenceClassification.from_pretrained('./models/transformers_model')
# inference_model.cuda()
# inference_model.eval()
# dls = get_dataloaders(dss, 64)
# for x, y in dls.train:
#     pred = inference_model(x)[0]  # logits; attentions are only returned when output_attentions is enabled
#     pred = pred.argmax(-1)
#     print((pred == y).float().mean())
#     break

Client inference

# # request attentions at load time; `model` from the training section does not exist in the client
# inference_model = RobertaForSequenceClassification.from_pretrained('./models/transformers_model', output_attentions=True)
# inference_model.eval()
# tokenizer = RobertaTokenizer.from_pretrained('./models/tokenizer')
# transform_info = json.loads(Path('./models/transform_info.json').read_text())
# category_map, max_len = transform_info['category_map'], transform_info['max_len']
# test_sentences = ['What a suck movie!!!', 'Feels like good. Nice movie.']
# with torch.no_grad():
#     for sentence in test_sentences:
#         numeric_sentence = tokenizer.encode(sentence)
#         x = torch.tensor([numeric_sentence])
#         pred, attention = inference_model(x)
#         pred = pred.argmax(-1)
#         print(category_map[pred.item()])
        
# #         attention = attention[-1][0][-1][0]
# #         tok_sentence = tokenizer.convert_ids_to_tokens(numeric_sentence)
# #         print(tok_sentence, attention)
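
The last step of the client code above turns the model's logits into a label through `category_map`. The sketch below isolates that step in plain Python so it can be run without the saved model. The logits are made-up numbers for illustration, not model output, and `predict_label` is a hypothetical helper.

```python
import math
from typing import List, Tuple

def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def predict_label(logits: List[float], category_map: List[str]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return category_map[idx], probs[idx]

category_map = ['negative', 'positive']                  # assumed vocab order
label, prob = predict_label([1.2, -0.7], category_map)   # made-up logits
print(label, round(prob, 3))
```

Taking the argmax of the raw logits gives the same label; the softmax is only needed when a confidence value is wanted alongside the prediction.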