insights on the benefits of the Hugging Face Datasets library
discuss the literature review on project tasks
get some ideas on how to visualize sequence data
complete chapter 6 (The Tokenizers Library) of the Hugging Face course
look into the characteristics of your dataset and:
write down the specifics of how your data was collected
create filter variables to group your input data according to special characteristics
consider the following questions:
What are potential biases in your training data?
Are there outliers in the dataset?
Are the classes balanced? (Only relevant if you are working on a classification task.)
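
As a starting point for the dataset checks above, here is a minimal sketch in plain Python. It assumes each example is a dictionary with a `text` and a `label` field; those field names, the toy `records`, the `is_short` threshold, and the helper functions are all hypothetical and only for illustration. It adds simple filter variables, computes the class distribution, and flags length outliers with the common 1.5 × IQR rule, which is one possible outlier heuristic among many.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical toy data: (token count, label) pairs turned into records.
# Replace this with rows from your own dataset.
records = [
    {"text": " ".join(["tok"] * n), "label": label}
    for n, label in [
        (2, "pos"), (3, "pos"), (3, "neg"), (4, "pos"),
        (4, "pos"), (5, "neg"), (5, "pos"), (40, "pos"),
    ]
]


def add_filter_variables(rows):
    """Attach grouping variables: whitespace token count and a 'short text' flag."""
    enriched = []
    for row in rows:
        n_tokens = len(row["text"].split())
        # The threshold of 5 tokens is an arbitrary assumption for this sketch.
        enriched.append({**row, "n_tokens": n_tokens, "is_short": n_tokens < 5})
    return enriched


def class_balance(rows):
    """Return the share of examples per label."""
    counts = Counter(row["label"] for row in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def length_outliers(rows, k=1.5):
    """Return rows whose token count falls outside the k * IQR fences."""
    lengths = [row["n_tokens"] for row in rows]
    q1, _, q3 = quantiles(lengths, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if not low <= row["n_tokens"] <= high]


rows = add_filter_variables(records)
print("class shares:", class_balance(rows))
print("short examples:", sum(row["is_short"] for row in rows))
print("length outliers:", [row["n_tokens"] for row in length_outliers(rows)])
```

If your data is already loaded with the Hugging Face Datasets library, the same idea can be expressed with `Dataset.map` to add the filter variables and `Dataset.filter` to select a group. Whitespace splitting is only a rough stand-in for the tokenizer you will actually use.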