JUPYTER NOTEBOOKS

What is Jupyter Notebook?

Jupyter Notebook is an application (a program) that is used to execute programming code. A notebook consists of cells, so your code can easily be split into pieces and interleaved with explanatory text, equations, figures or whatever else you need.

There are two types of cells:

  • code cells
  • text cells (with markup text, i.e. text which is formatted)
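
To make the two cell types concrete, here is a minimal sketch of what a notebook file (.ipynb) looks like on disk: it is JSON with a list of cells, and each cell records its type. Text cells are stored with the type "markdown". The notebook content and the helper name cell_types are made up for illustration; they do not come from the course notebooks.

```python
import json

# A hypothetical two-cell notebook, built as the dict that an .ipynb file encodes.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # A text cell: formatted text written in Markdown.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Word count\n", "The cell below counts the words in a sentence."],
        },
        {
            # A code cell: Python code that the notebook can execute.
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["sentence = 'the cat sat on the mat'\n", "len(sentence.split())"],
        },
    ],
}


def cell_types(nb):
    """Return the type of each cell, in the order the cells appear."""
    return [cell["cell_type"] for cell in nb["cells"]]


# Serialize to the text an .ipynb file would contain, then read it back.
ipynb_text = json.dumps(notebook, indent=1)
reloaded = json.loads(ipynb_text)
print(cell_types(reloaded))  # ['markdown', 'code']
```

In practice you rarely edit this JSON by hand; the notebook application displays each cell in its editable form and writes the file for you.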

The Jupyter notebooks listed below provide an introduction to Python, with a focus on text processing and the analysis of datasets in linguistics and language studies. They are addressed to anyone who is generally familiar with computers and who has an elementary knowledge of linguistics and math. Some of the notebooks use digital language resources from CLARINO and other sources. The notebooks are meant to be studied in order.

The notebooks are currently stored on Google Colaboratory, where you can run them remotely if you have a web browser and a Google account. Please copy each notebook to your own Google Drive (File > Save a copy in Drive) before running it, and then edit your own copy.

Alternatively, you can download the notebooks and run them locally on your own machine using a suitable application such as Visual Studio Code, or you can upload them to another online service such as Kaggle, Deepnote or Binder, and run them there.

Get started!

  1. First steps with Jupyter Notebook: Python as a calculator
  2. Strings for representing text
  3. Common beginner’s errors

    Strings and writing systems

  4. String operations
  5. Writing systems

    Sequences and sets

  6. Lists
  7. Tuples
  8. Sets

    Functions

  9. Function definitions
  10. Parameters in functions
  11. Local variables in functions

    Control structures

  12. Conditions with if
  13. Iteration with for and while
  14. Comprehensions
  15. Iterators and generator expressions

    Attribute-value data

  16. Dict

    Input and output

  17. Formatted output
  18. Interactive input

    Ranges and slicing

  19. Ranges
  20. Slicing with step
  21. Palindromes and retrograde sorting

    Regular expressions

  22. Regex search
  23. Regex search continued
  24. Regex substitution and split

    Word tokenization and frequencies

  25. A simple word tokenizer
  26. Tokenization and frequencies with NLTK
  27. Counters and plotting

    Zips and n-grams

  28. Zips
  29. N-grams

    Accessing text in files and from the web

  30. Writing and reading files
  31. Accessing Google Drive
  32. Reading plain text from the web
  33. Extracting text from HTML web pages

    Tabular data types

  34. Arrays and dataframes

    Datasets from web sources

  35. Dataframe from CSV on the web: sorting, matching and counting
  36. Groups and visualization with Anscombe's quartet (optional)
  37. Summing values in groups
  38. Accessing data on the web through APIs
  39. Making a CSV data file or dataframe from text lines (optional)

    Workflows with dataframes

  40. Addressing rows and columns in a dataframe to make a dict
  41. Combining CSV and API to make a dict
  42. From dict to dataframe to formatted table
  43. Workflow with corpus data normalization, table and plot

    Recursive functions

  44. Recursive functions and assert
  45. Levenshtein distance as a recursive function
  46. Finite state automata (optional)

If you want to know more details about how something in Python works, you can look it up in the online Python documentation.