Languages and information technologies

Today, information technologies make a wide range of tools and resources available to all those who work in languages. In the course of your studies, you will find yourself using computers and online tools more and more extensively throughout your writing and research. Here, you can explore further how you can use these tools in many areas of your work, and in the writing that you yourself will produce.

This site aims to document some of the ways in which you can use these resources productively. This documentation is divided into eight main sections. You can read through all of these in sequence, or concentrate on individual parts as the need arises. You will also find template files that you can use in your own written work.

Encountering information technology

In your studies, you will have access to a wide range of resources and facilities, and you will make use of digital tools in availing of them. You will be able to do so using open-access computers on campus, as well as equipment that is at your own personal disposal and which you may choose to bring on campus with you or use to work remotely.

Beyond these resources, the internet will allow you to explore a vast range of collections and applications that will have a bearing on all aspects of your work, spanning research, the collection of data and materials, reading and writing.

Your work in languages

You are already likely to be an experienced and adept user of information technologies. The guidance on this site is designed to illustrate how tools that you may already be familiar with can be used to new ends and in new ways. Over the course of your studies, you will also discover new applications and online resources, many of them specific to languages.

Here, you can begin to find out more about all of these many kinds of information technology. You will discover that a given kind of work can be made possible or supported by more than one tool. One of the ways in which you will progress as an independent and autonomous learner is by working out which of these many options support you best in your work. This site aims to guide you in this work, and it seeks to anticipate some of the issues that you are likely to want to think about in using information technologies, whether you are using the facilities on campus or you have the opportunity to work with a device of your own.

 

The documentation has been developed to support students in the BA in World Languages in University College Cork. This programme is taught by staff in the School of Languages, Literatures and Cultures, the School of Irish Learning, the Department of Classics and the Language Centre.

1 - Learning with information technologies

Information technologies can support your work in a wide range of areas of learning

1.1 - What computers can do

The term “computer” embraces many devices today, all of which can play productive roles in your work

At their simplest, computers are devices that can store or access data, on which you can perform a range of operations — for instance, a text file in which you can search for occurrences of a given word, a word-processing application in which you work on an essay, an academic source that you can read and annotate, an application or web service with which you can interact in the process of learning, or a dedicated search engine in which you can explore existing knowledge in subjects of interest to you.

These are all operations you can perform on a computer, but also, say, on a smartphone or a tablet.

A word-processing file: data and operations

So, to take another example, a word-processing file contains data in the form of content and a computer application allows you to perform various operations on it. You can save the content as a file, in other words, store all of the data on your hard disk — or on a remote server “in the cloud”. You can print the file if you have a printer attached as a peripheral, or by sending data over the internet to a connected printer. Word-processing applications make use of a graphical interface, which means that you can click on an icon to select bold or italic fonts, for example. Given that much of your work will consist of writing essays, you have everything to gain from becoming proficient in the use of these tools. And of course computers are fully equipped to carry out numerical calculations, like counting the number of words in a file or document.

Word count using the command line

In its earlier realizations, a computer operated on the basis of typed instructions via the command line, rather than through a graphical interface. Here, a simple program is invoked using the command wc, or “word count”, and applied to data in the form of a Markdown file. By default, the program counts the lines and bytes (which correspond to characters in plain ASCII text), as well as the number of words, though you can also limit it to a word count only.
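
What wc reports can be approximated in a few lines of Python. This is only a sketch, not the implementation of the wc program itself: the function name `wc` and the sample text are assumptions made for illustration.

```python
def wc(text: str) -> tuple[int, int, int]:
    """Return (lines, words, bytes) for a string, in the spirit of `wc`."""
    lines = text.count("\n")           # one line per newline character
    words = len(text.split())          # words are runs of non-whitespace
    size = len(text.encode("utf-8"))   # bytes rather than characters
    return lines, words, size


# Invented sample standing in for the contents of a small Markdown file.
sample = "# Notes\n\nLanguages and information technologies\n"
print(wc(sample))
```

Counting bytes rather than characters matters as soon as a text contains accented letters, since a letter like é occupies two bytes in UTF-8.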

This is perhaps the simplest operation that you can carry out working on text using a computer. A computer will also help you to sort all of the words contained in a text in alphabetical order. It can also be used to retrieve lemmas, that is, to give the dictionary form of each word extracted from a text, e.g. ‘to do’ from ‘done’. It can count and display occurrences of individual words, or tokens, in a given text.
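
Two of these operations, sorting words alphabetically and counting tokens, can be sketched with Python's standard library. The sample sentence and the variable names are invented for illustration. Lemmatization is left out, since it normally relies on a dictionary or a trained language model.

```python
import re
from collections import Counter

# Invented sample text, for illustration only.
text = "The cat sat on the mat. The mat was flat."

# Lower-case the text and extract runs of letters as word tokens.
tokens = re.findall(r"[^\W\d_]+", text.lower())

# The distinct words, sorted alphabetically (by code point, not by locale).
types_sorted = sorted(set(tokens))

# How often each token occurs.
frequencies = Counter(tokens)

print(types_sorted)
print(frequencies.most_common(2))
```

Note that `sorted` orders strings by Unicode code point, so accented letters will not necessarily fall where a dictionary in a given language would place them.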

Word tokens in a word cloud (source: Voyant Tools)

A web application like Voyant Tools can perform all of these operations, providing, as here, a word cloud displaying word-frequencies in the poems of Emily Dickinson.

The computer as a device for transforming information

Today, we think of computers and similar devices as tools for studying how we can handle and transform information. In other words, the handling and processing of information are as significant as the devices themselves through which we study them.

Processing information: the case of translation (source: DeepL)

A major area of development in information sciences is natural language processing and, in turn, automated translation. Here, a translation device operates by estimating the probability that a sequence of words in one language is a suitable rendering of a given sequence in another. This is something that the computer application is equipped to do by drawing on a large volume of relevant linguistic data.

So, a computer can be thought of as a layered interactive device: you interact with it by entering a query and then the application in question interacts with the large volume of data available to it in order to provide a response.

Automated translation: alternatives for ‘dispositivo’ in the database

Computer operations

Originally, computers, as the term implies, were designed to handle computations.

Using a computer to calculate

Needless to say, computers continue to be used in this way. Moreover, computers use numbers to store and handle data. For instance, each character of a written script displayed on a computer screen is encoded as a number, conventionally written in hexadecimal notation, that is, in base 16, using the digits 0 to 9 and the letters A to F.

Latin Script encoded using hexadecimal numbers

The upper case A character has a Unicode value of U+0041.
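
The link between a character and its number can be checked directly in Python, whose built-in `ord` and `chr` functions convert between the two. The helper name `describe` and the choice of sample characters are assumptions made for this sketch.

```python
def describe(char: str) -> str:
    """Show a character's Unicode code point in hexadecimal and in decimal."""
    code_point = ord(char)
    return f"{char} U+{code_point:04X} (decimal {code_point})"


for char in "Aaé":
    print(describe(char))

# The conversion also works in the other direction: 0x41 is hexadecimal 41.
print(chr(0x41))
```

Hexadecimal 41 is 4 × 16 + 1, in other words 65 in decimal, which is why both notations describe the same character.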

Now, as we have just seen, we think of computers also as devices to store and transform data, whether locally (on a disk) or remotely (in “the cloud”). A computer contains dedicated programs, or applications, which allow you, say, to create documents (whether word-processed files or PDFs) or to analyse large collections of data using applications like concordance tools.

More and more, dedicated computer applications can help you in your research: for example, you can use a web browser to search the internet or to access specialized collections of data held in a library.

Searching for information: aided by artificial intelligence

There is an active dimension to this process: when you begin to enter text, the web application itself mobilizes relevant information in order to anticipate the completion of your query. This is an instance of artificial intelligence in use: the search interface in question can draw on the very large amount of data generated by previous searches to predict with a higher degree of probability the exact scope of your search.
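
The principle of predicting a query from previous searches can be illustrated with a deliberately simple sketch. It ranks earlier queries that begin with the text typed so far by how often they occurred. Real search interfaces rely on far more elaborate models, and the function name `suggest` and the search history below are invented for illustration.

```python
from collections import Counter


def suggest(prefix: str, past_queries: list[str], limit: int = 3) -> list[str]:
    """Rank past queries that start with `prefix` by how often they occurred."""
    counts = Counter(query.lower() for query in past_queries)
    needle = prefix.lower()
    matches = [query for query in counts if query.startswith(needle)]
    # Most frequent first; ties are broken alphabetically.
    matches.sort(key=lambda query: (-counts[query], query))
    return matches[:limit]


# Invented search history, for illustration only.
history = [
    "french grammar", "french verbs", "french grammar",
    "free dictionary", "french grammar", "french verbs",
]
print(suggest("fre", history))
```

The more data of this kind an interface has collected, the better its chances of proposing the query that a user actually has in mind.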

On the other hand, when you write an essay using a word-processor, you are creating data of your own, as well as accessing data from other sources that you may require, using search strategies like these. When you make reference to the sources that you find, you can include links to the remote online locations where your readers can in turn access them.

In other words, your work can trigger further interactions with information on the part of others using computers.

A reference with a hyperlink to an online source

A traditional computer allows you to carry out these interactions via peripherals, like a keyboard. But today you can use a smartphone as a computer, in which case you carry out operations with it via a virtual keyboard.

Using a smartphone to search a library catalogue via the web

The computer as archive

At the same time, a computer can be understood as an archive or database, containing information and data that you yourself may have assembled. These data can be materials that you create, like your notes or essays, or collections of materials relevant to your work, which you can catalogue and organize using further dedicated applications.

Using Jabref as a bibliographical archive

It’s a good idea to aim to be organized in how you store information on your own devices, making effective use of the hierarchical structure of computer file management.

The hierarchical directory system: the Documents folder in macOS

For instance, you could create a directory for each of your courses, with sub-directories where relevant to organize your material. Local directories on your own machine or remote ones in the cloud all have the same structure and so can be organized in the same way.
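
A structure of this kind can also be created programmatically. The sketch below uses Python's `pathlib` module. The function name `create_course_folders`, the course codes and the sub-directory names are hypothetical, and the example works in a temporary directory so that it leaves your own files untouched.

```python
import tempfile
from pathlib import Path


def create_course_folders(root: Path, courses: dict[str, list[str]]) -> list[Path]:
    """Create one directory per course, each with the given sub-directories."""
    created = []
    for course, subfolders in courses.items():
        for name in subfolders:
            folder = root / course / name
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Hypothetical layout: one directory per course, with sub-directories.
layout = {
    "WL4102": ["notes", "essays", "sources"],
    "WL1101": ["notes", "readings"],
}

with tempfile.TemporaryDirectory() as tmp:
    for folder in create_course_folders(Path(tmp), layout):
        print(folder.relative_to(tmp))
```

To build the same tree somewhere permanent, you would pass a directory of your own choosing as `root` in place of the temporary one.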

Computers have now also become much better equipped to carry out searches among your files and folders. In other words, you don’t always have to explore the entire tree structure of all of your directories to retrieve an item.
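
A much simplified version of such a search can be sketched in Python. The sketch walks a directory tree and matches file names only, whereas the search tools built into an operating system typically index the contents of files as well. The function name `find_files` and the sample files are invented for illustration.

```python
import tempfile
from pathlib import Path


def find_files(root: Path, term: str) -> list[Path]:
    """Return files below `root` whose names contain `term`, ignoring case."""
    needle = term.lower()
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and needle in path.name.lower()
    )


# Build a small throwaway tree so that the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "WL4102" / "essays").mkdir(parents=True)
    (root / "WL4102" / "essays" / "Essay-draft.txt").write_text("draft")
    (root / "WL4102" / "reading-list.txt").write_text("sources")
    for match in find_files(root, "essay"):
        print(match.relative_to(root))
```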

How to search: activating Spotlight in macOS

In macOS, a single interface can combine a wide range of searches.

The results of a search: local and remote

The results in turn provide you not only with information on the contents of your files, but also point you in the direction of relevant internet sources.

How does a computer operate?

An operating system (or OS) is system software that allows applications to interact with the computer’s hardware, for example, by allocating the computer’s resources and memory to the different applications that a user may invoke. The operating system also controls the device’s display and it functions independently of you as user, mobilizing the different components of the machine in response to your commands.

Operating systems are pervasive nowadays, extending to computers (e.g. Windows, macOS), smartphones (Android, iOS) and indeed televisions (tvOS, webOS, Android TV).

The operating system at work: the case of a computer

Any computer runs a number of software processes via the central processing unit, or CPU, at any one time. The CPU is thus a key component of the computer’s hardware. A computer uses its system memory, or RAM, in order to function, e.g. to make data available to the programs or applications that you may install on your machine while these are in use. A web browser (in this case, Firefox) typically makes heavy demands on this primary memory. A computer uses a different, more permanent, form of memory to store information on disks. And finally, most computers today have network connections that allow you to access remote data via the internet and the World Wide Web. The operating system controls all of these processes and their interactions.

System monitoring: Task Manager in Windows

Today, computers can be complex machines designed to fulfil a wide range of uses. Operating systems are designed to facilitate your interactions with a device, and with all of the information and processes on which you may draw in your work. For this reason, it makes sense to explore and learn just how Windows or macOS or Chrome OS operates and how each one can help you in your use of the different computers you are likely to encounter.

Text and hypertext: networks and the web

The web forms part of the internet, or the network of all connected networks. When you go online, for example via WiFi, you connect to a local area network. This network is in turn connected to innumerable other networks via specialized gateways and servers.

A webpage is located in a web domain, in other words a computer or group of computers that form a local area network of their own and are likewise connected through a web server to the internet. To connect to such a page, you activate, as you well know, a hyperlink which contains the web address of the domain and page in question.

Activating a hyperlink

Hyperlinks are thus the building blocks of the web. Here, you can see that the user has hovered over a given link with the mouse and that the address to which it points then appears in the bottom left-hand corner of the web browser (in this case, Firefox). Clicking on the link activates it.

Your web browser will then retrieve and display the page. All of this information travels over the internet in the form of packets, using dedicated internet protocols for communication and exchange.

Checking a connection to a domain with the ‘ping’ command

Web links are just one kind of uniform resource locator. They contain information about the domain where the resource to be retrieved is located (in the case below, www.ucc.ie) and, if necessary, the path to an individual item within the hierarchical directory structure.

Anatomy of a uniform resource locator: the case of a web link

You can also use the browser to send a command to the remote machine: in the second example, the command is to search for and display information about a module with the code WL4102.
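
The components of a web link can be separated out using Python's `urllib.parse` module. In this sketch, both addresses are invented for illustration: only the domain www.ucc.ie and the module code WL4102 come from the text above, while the paths and the parameter name `code` are assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical addresses: the paths and the "code" parameter are invented.
page_link = "https://www.ucc.ie/en/languages/index.html"
search_link = "https://www.ucc.ie/modules/search?code=WL4102"

page = urlparse(page_link)
print(page.scheme)   # the protocol used to retrieve the resource
print(page.netloc)   # the domain where the resource is located
print(page.path)     # the path through the directory structure

search = urlparse(search_link)
print(search.path)             # the resource that receives the command
print(parse_qs(search.query))  # the parameters sent with the command
```

The part of the address after the question mark is the query string, which is how a browser passes a search term or other instruction to the remote machine.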

The internet has greatly modified our practice of reading and working with documents. It allows us to locate, document and retrieve sources of information, and in turn to embed connections to such sources in our own documents. These and other kinds of link can substantially expand the scope of your learning.

1.2 - Digital literacy

Your studies will give you plenty of scope to develop your digital skills

Digital literacy consists in how you understand and engage with the processes of living, learning and working in a connected society. In many aspects of our lives, we use digital tools, sometimes pervasively. Understanding these tools extends to knowing how they operate, how best to interact with them and how to use them productively in your life and work.

In the context of your studies, you will also develop both your general and your dedicated digital knowledge and capacity. In other words, you will acquire generic skills, relevant to information technology in general. You will also acquire more specific skills and knowledge — how to use applications relevant to languages and to text in particular, and how to use this knowledge in exploring language.

In reading these pages, you will be able to explore the latter in particular, by thinking about the kinds of technologies that are relevant to your work in learning about language and languages.

A whole spectrum of skills is involved: conceptual (e.g. understanding the relationship between a language and a script), practical (e.g. how to make the most productive use of a computer keyboard), technological (e.g. how to encode transcriptions using the International Phonetic Alphabet), and cognitive (e.g. understanding how technological literacy can contribute to your data and information literacy).

Learning technologies

Learning types and learning technologies (YouTube)

 

If we adopt the perspective of the conversational framework of education, we can see how different digital technologies and tools can be mobilized, both within and beyond the classroom.

 

Learning process | Digital tools and activities
Acquisition      | Virtual learning environment (e.g. Canvas), online lectures
Inquiry          | Library and internet searches, research databases
Discussion       | Online classes (e.g. MS Teams), quizzes, discussion boards
Practice         | Oral and written practice (e.g. word-processing, creating video clips)
Collaboration    | Joint projects (e.g. Google Drive), online forums
Production       | Writing essays (e.g. Office 365, Google Docs, Libre Office)

Conversational framework for learning

Digital capabilities

The relevant digital skills have also been conceptualized as falling into six distinct kinds of capability, with a general understanding of information and communication technologies helping to connect them.

Six digital capabilities (source: JISC)

These capabilities have been extensively documented, including their critical dimension and also their important implications for your digital well-being. When thinking about how you want to develop your digital literacy, consider also these tips. The effective use of digital technologies and tools is more and more a key path to becoming a proficient independent learner.

Reflect on your experience

It is a good idea to monitor and reflect on your experience of learning technologies: in the course of your studies, you are likely to encounter many different kinds of tool and application and it makes sense to think about how these help you in your work. Expect change: new resources and ways of accessing and using information are constantly being developed, and your own work is likely to evolve in parallel. Monitor and learn from your own use of these resources and think critically about their strengths and weaknesses.

1.3 - In class and outside the class

Make the most of your time in and out of class by using the learning resources available to you

Learning technologies have proliferated since the first development of the internet and as a result the boundary between work done inside and outside the classroom has shifted: for instance, as well as attending lectures in person, you can also view materials provided by your teachers in advance and then find that the time in class is devoted more to discussion and debate (this is the model of the “flipped classroom”). Time in the classroom can also be divided between traditional acquisition of knowledge in lecture format and interactions involving teachers and other students — sometimes using digital tools.

Technologies allow you to view and explore materials relevant to your learning, and to interact with your teachers and classmates. They can also be drawn upon in your individual work, for example, in your research and note-taking.

The range of learning channels

A further notable feature of higher education today is the variety of channels through which teaching can be delivered.

Teaching and learning technologies (source: Daniel Stanford)

Asynchronous and synchronous

An important distinction is made between asynchronous and synchronous access to teaching and learning materials:

  • materials can be made available to you to engage with in your own time between sessions: this asynchronous approach will allow you to organize your work independently, meaning that you can work at your own pace and in ways that suit your learning style, while also taking into account any constraints that may affect you
  • weekly sessions are not limited to physical meetings in class: using synchronous tools like Microsoft Teams or Google Meet you can engage in discussion and debate with your teachers and classmates, again on specific issues and themes, with the possibility of sub-dividing the group for part of a given session to facilitate something more like face-to-face discussion

The diversity of tools and channels means that you can experiment with different learning styles and find the ones that work best for you in a given area of your studies. Not all of your work in learning demands the immediacy of the classroom and one of the notable capacities that you will develop is the ability to combine independent work with structured learning via a wide range of media.

2 - Research

Research tools and how to use them in your work

2.1 - Research and reading

Explore research questions

In your studies, you will encounter open research questions and in writing your essays you will engage actively with these. This will imply working outside and beyond what is covered in class: you will be encouraged to find and analyse relevant sources, data and theories, and to draw on these in developing arguments and conclusions of your own.

Reading and taking notes

Once you have located relevant sources, your next task is to read them and to discover how the ideas and arguments they contain can shape your own work.

It is important to keep track of the ideas and arguments on which you draw, so that you retain a clear understanding of the substance of your sources. What this usually means is taking careful notes and you can make use of these too in documenting your sources.

Nowadays, many of the sources on which you will work are available in Portable Document Format. This means that you can, if you like, read and annotate these directly on-screen, using an application like Adobe Acrobat Reader (usually pre-installed), or PDFCandy (Windows), or Skim (macOS).

Annotating a PDF in Skim

Working like this has the advantage that you always have the original source to hand and so can easily verify quotations and page references. You can also keep all of your sources for a given piece of work in one computer folder, which makes it easier to create a bibliography.

Note-taking applications

Another strategy is to make use of dedicated note-taking applications. One example is Microsoft OneNote. The application can be installed on a computer and also on a mobile device, and you can also access your notes online once you have set up a user account.

A notebook in the OneNote desktop application

You can treat this digital tool as a surrogate for a paper notebook, in that you can click and insert material at any place in the “page”. You can also incorporate scans of physical documents and can record audio notes, among other facilities.

Discover what works best for you

The important thing about taking notes is to ensure that whatever process you use is productive for you. You may, for instance, find it more convenient to combine physical and digital notes, and some writers argue that the act of writing with pen and paper is more conducive to remembering the material that you have learnt. Experiment and discover which style, or styles, of learning works best for you in different contexts.

2.2 - Discover research materials

How to access primary and secondary resources

Through the library, you have direct access to books and other printed materials and also to a vast set of online sources to which the library subscribes. Your research will also lead you to explore a much wider range of online collections, databases and corpora, using dedicated search strategies.

There is also guidance provided by the library on resources of particular relevance to language subjects.

Primary sources

A typical primary source is a text that is addressed in class and works by the same writer or comparable texts from the same context would also be primary sources. A collection of data is also a primary source, for instance, if you are writing about language change. And if you are writing about a historical or cultural moment, there is a still wider range of primary sources on which you could draw, including newspaper articles, film clips that include archival footage as well as more recent film and documentary material, websites containing archival material, podcasts and interviews in print. So, a primary source can be a text or artefact directly produced in the context that you are studying; in other words, it can be contemporary with the topic or period in question, or can be a literary text or other form of cultural production from this period.

In the case of primary sources, think carefully about the kind of material in question. To what end was this source produced? To whom is a given source addressed? What message or messages might a source have been intended to convey? How representative is a source as a means of documenting and understanding a period or historical event? Where relevant, comment on the medium in which a source was produced, e.g. a film, a news reel, a photograph, a poster, a mass-produced text or image, song lyrics (with or without original music).

Secondary sources

Academic books, essays and articles are typical secondary sources. You will receive guidance on relevant materials in class and you can carry out your own searches to locate further sources that help you to address a particular issue in your own writing. You can initiate your searches from the library homepage.

OneSearch: locate secondary sources from the Library homepage

A secondary source draws on one or more primary sources to provide a historical, interpretative or critical account of a historical period or a form of cultural production.

Take care to document the sources of any materials you use, primary or secondary: in your written work, you must give specific references to sources that you use. When reading sources, take notes to assist you in retrieving references that you will need in documenting your sources in your own written work.

In using secondary sources, aim to identify the main claims or conclusions made by the author or authors. Are the claims and conclusions based on specific sources? How well founded do you consider the claims or conclusions in a secondary work to be in the light of the sources cited and of your own reading drawing on other sources?

Online resources

You can also avail of a range of important research tools, notably corpora, which are large scale collections of linguistic data, sometimes spanning more than one language.

These tools often provide interfaces that allow you to carry out detailed lexical investigations. It is a good idea to use a reference manager to keep track of resources like these (though it should be noted that these and other web addresses, or URLs, are liable to change over time).

Reference sources

You also have access to a wide range of authoritative reference resources and you should as a rule make use of these rather than drawing on webpages that happen to appear among the results of an internet search. One example of such a source is Oxford Reference Online, to which you have access through the library.

Open access

Alongside the many sources of knowledge and information to which the University subscribes, you can also make use of open-access materials, in other words, research sources that are available online free of charge and without restrictive licensing conditions.

Open access is an increasingly important way of availing of current research and a wide range of open-access outlets now exists in several European languages. You can also explore open-access journals and in addition UCC maintains its own institutional repository of open-access research.

2.3 - Use a reference manager

How to record all of your research discoveries

A reference manager is a flexible tool: it can be integrated into word-processors like Microsoft Word, Libre Office and Google Docs, allowing you to generate citations, footnotes and bibliographies automatically. Its scope is wider still, in that it can also be integrated into a web browser, which means that you can record details of items that you find online, whether articles that you find in a research database like JSTOR or individual webpages.

A reference manager is one type of note-taking application and it has the advantage over pen and paper that you can use it to record and re-use links to research resources. Try experimenting with a tool like Zotero for a week or two to see how it can help you in your reading and research, and in turn in the writing of your essays and other projects.

Experiment with Zotero

Zotero is a free and open-source reference manager which is available for Windows and macOS. The application provides users with a limited quantity of online storage free of charge; a subscription is required to retain larger volumes of material, though data can also simply be stored locally on your own hard disk.

Zotero: desktop application

Using a reference manager like Zotero, you can develop a database of references and materials over time, and use them whenever appropriate in your different projects. For each project, you can create a collection and you can also in turn share items between different collections as you develop new projects.

You can add an item by entering the relevant details and then saving it to your collection.

Adding an item to Zotero

Integrating Zotero into your web browser

Zotero: web interface

Zotero can also be used with a web browser, allowing you to record details of useful or interesting items in the course of your research — these can include webpages as well as articles or books.

In Google Docs, for instance, you can install Zotero Connector as an add-on, to automate the process of recording items.

Using Zotero Connector in Google Docs

Where a document, typically in the form of a PDF, is associated with a reference, you can also download this and link it to the Zotero record in your database.

Editing with Zotero: reproducing research references

One of the major functions of a reference manager is to allow you to insert references into an essay and to generate a bibliography of all of the items that you cite.

Using Zotero with Google Docs

Once you install the Zotero Connector in Google Docs, you will see a dedicated button appear on the menu bar: you can then insert a reference to an item.

A reference created using Zotero

Select Document preferences… to specify the style you wish to use in the document in question.

Apply styles in Zotero in Google Docs

The Zotero desktop application will then open: next, you can choose the relevant style standard.

In addition, you can use Zotero to insert a bibliography at the end of your document. By default, it will include all of the items you have cited in the bibliographical style that you have specified.

A bibliography generated with Zotero

Zotero with Microsoft Word and LibreOffice

It is also possible to use Zotero with other word-processors, including Word and LibreOffice.

2.4 - Research tools in languages

Work with dedicated tools for languages

Today, you can have access to a wide range of tools that allow you to analyse and interrogate linguistic material as data. Some of these resources are online tools, through which you can access or analyse texts or corpora of texts. You can also avail of free and open-source tools that you can install directly on a computer to which you have access.

Voyant Tools

This is a web-based application that allows you to analyse digital texts, normally in plain-text format (in other words, a text that includes only readable characters, and not any “rich text” elements, like different font styles and other elements of text formatting associated with word-processed documents).

Input text in Voyant Tools

Once you have uploaded a text, you will be presented with a dashboard incorporating a series of tools.

Voyant Tools in action

  • Cirrus, or word clouds as indicators of word frequency
  • a Reader which gives the full text being analysed (in this case, an interview with François Mitterrand soon after he became President of France in 1981, obtained from the Oxford Text Archive)
  • a set of Trends that displays the distribution of recurrent terms by segment of text
  • a Contexts tool that presents key words in context in the style of a concordance
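
To see what lies behind two of these tools, here is a minimal Python sketch of a word-frequency count (the basis of a word cloud) and of a trend across segments of a text. It is not Voyant's own code: the function names and the simple tokenization rule are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())


def word_frequencies(text, top=5):
    """Return the `top` most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(top)


def trend(text, term, segments=4):
    """Count occurrences of `term` in each of `segments` equal slices."""
    words = tokenize(text)
    size = max(1, -(-len(words) // segments))  # ceiling division
    return [words[i:i + size].count(term.lower())
            for i in range(0, size * segments, size)]


SAMPLE = "the cat sat on the mat and the dog sat too"
print(word_frequencies(SAMPLE, top=2))  # [('the', 3), ('sat', 2)]
print(trend(SAMPLE, "the", segments=2))  # [2, 1]
```

A real analysis would normally also filter out very common words such as "the" by means of a stopword list.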

The word cloud can be exported, either in the form of a URL or as an image file which you could then, for example, use in a class presentation.

A word cloud in Voyant Tools

Sketch Engine

Sketch Engine combines text analysis with tools that allow you to access and to create corpora. You can access Sketch Engine using your UCC logon credentials (this access is provided on the basis of European research funding and lasts at least until March 2022).

Getting started with Sketch Engine
 

Once you select a corpus to work on, you can analyse words or strings of words using various tools, including the Word Sketch, which allows you to analyse the grammar of a term on the basis of its usage.

Accessing Sketch Engine

Take the simple English word “book”, which can of course be a noun or a verb. A learner of English can begin to see some of the contexts in which it is used as either of these parts of speech.

A word in Sketch Engine

Among many other options, Sketch Engine can be used to generate corpora of your own. If you are interested in art, for example, you can construct a corpus from web sources in a language that you are studying.

Creating a corpus

You will need to do some preliminary research on relevant online sources before finalizing your corpus. Once it is completed, you can use all of the tools in Sketch Engine to identify key terms (e.g. through frequency counts), understand how they are used in context (e.g. through concordances) and explore a term’s syntax (e.g. through concordances or word sketches).
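
A concordance presents each occurrence of a term with the words around it. The following Python sketch shows the principle in its simplest form; it does not reproduce Sketch Engine's own method, and the function name and the `width` parameter are assumptions made for illustration.

```python
import re


def concordance(text, term, width=3):
    """Return keyword-in-context lines for every occurrence of `term`.

    Each line shows up to `width` words on either side of the keyword,
    which is marked with square brackets.
    """
    words = re.findall(r"\w+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == term.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


for line in concordance("I will book a table. The book is on the table.", "book", width=2):
    print(line)
# I will [book] a table
# table The [book] is on
```

Even in this tiny example, the context shows the word used first as a verb and then as a noun.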

Desktop tools

There are a number of ways in which you can carry out similar work using a computer to which you have direct access and on which you can install software. Two instances are the various open-source tools developed by the linguist Laurence Anthony and LancsBox, a comparable set of tools developed by specialists at Lancaster University.

The advantage of working with tools like these is that you can learn directly about language analysis by exploring the various options that they provide.

Using LancsBox
 

2.5 - How to search a library catalogue

Using library catalogues as research tools

A library catalogue contains a very considerable amount of information. The simplest way to search a catalogue is by keyword. But there are also several more focused ways to search for an item that might be of interest to you.

You can initiate a search directly from the library homepage.

Search by keyword

Because the library catalogue is in a different web domain, the results will appear in a new browser tab.

A keyword search

This search is very broad in scope: it generates nearly 21,000 results. But you have the means to locate materials more precisely by using other search options.

Search options

If you know the author of a work, or indeed its title, then you can search for these directly.

A search by Subject is different from a keyword search in that it uses a controlled list of terms.

A catalogue entry with subject headings

Here is the entry for the first item in the long list generated by a keyword search. You can see that it has specific subject headings associated with it. Once you can identify subject headings relevant to the topic that you are interested in, you can use these to conduct more specific searches and to identify cognate items.

Note too that if you follow the link in a subject heading, you can explore other ones that might be relevant to your topic.

Cognate subject headings

All library catalogues also provide for the option of an Advanced search. By this means, you can combine search elements and also limit the scope of a search, for example, by date or by language.

An advanced search page
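
The difference between these kinds of search can be made concrete with a small Python sketch. The records below are invented placeholders rather than entries from any real catalogue, and the three functions are hypothetical simplifications of what a catalogue does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    author: str
    year: int
    language: str
    subjects: tuple


# Invented placeholder records, used only to show how the searches differ.
CATALOGUE = [
    Record("Title A", "Author One", 1995, "fre",
           ("French literature -- History and criticism",)),
    Record("Title B on French cinema", "Author Two", 2010, "eng",
           ("Motion pictures -- France",)),
    Record("Title C", "Author One", 2018, "eng",
           ("French literature -- History and criticism",)),
]


def keyword_search(records, keyword):
    """Match the keyword anywhere in the title, author or subject headings."""
    needle = keyword.lower()
    return [r for r in records
            if needle in " ".join((r.title, r.author, *r.subjects)).lower()]


def subject_search(records, heading):
    """Match only records that carry exactly this controlled subject heading."""
    return [r for r in records if heading in r.subjects]


def advanced_search(records, author=None, language=None,
                    year_from=None, year_to=None):
    """Combine several criteria; a criterion left as None is ignored."""
    return [r for r in records
            if (author is None or r.author == author)
            and (language is None or r.language == language)
            and (year_from is None or r.year >= year_from)
            and (year_to is None or r.year <= year_to)]
```

Here a keyword search for "french" returns all three records, a subject search for the literature heading returns two, and an advanced search combining author, language and date returns only one.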

Search in databases

All of these search options are available to you also when you search in databases to which you have access through the library. JSTOR, which is a digital library of academic journals, books and some primary sources, is one such resource.

Search options in JSTOR

The default option is search by keyword, with the more specialized options, including some, like Text Analyzer, which are specific to JSTOR, being listed at the head of the page.

3 - Writing

How to produce well-presented and well-edited documents

3.1 - From research to writing

In your work, you will find that research and writing are closely integrated

Writing essays is a central element of your work in languages, one that draws on all of the knowledge and all of the skills that you will acquire in the course of your degree. It is in itself a process of discovery.

What is a critical essay?

A critical essay is a means of advancing an argument, of making and justifying claims. A critical essay is shaped by and engages with existing knowledge of the material on which you’re working. A critical essay is objective and responsive all at once. A critical essay is always well documented, using the relevant style conventions. A critical essay is an exercise in persuasion more than it is the rehearsal of a personal point of view.

The essay as genre

The essay is a literary genre… The practice of writing is intrinsic to cultural debate and the essay is one of its central genres. Many influential writers on literature and culture rely on the essay as a means of intervening in contemporary society.

The essay is also a scholarly genre that informs and persuades. It combines information and argument, evaluation and research. Because it draws on these distinct kinds of work, essay writing is your means of participating in informed debate. Writing, therefore, implies research and research implies immersion in sources, both primary and secondary.

Because your essay draws on primary sources and on critical treatments of them, it resembles a scholarly work. An important feature of literary essays is that their treatment of sources is always explicit.

Good writing

Because essays are as a rule relatively brief, they are often examples of writing that is at once polished and forceful. Good writing implies revision, refinement and dialogue: your own essays will form part of your debates with your teachers.

Research and sources in the essay

Research will form part of your work. Your essay is researched and written with a specific focus in mind, as defined by the topic on which you choose to write. This is what gives your essay its primary purpose — to give a thoughtful, engaged and engaging account of a specific set of issues. In your research as well as your writing, it is important always to ensure that what you have to say remains relevant to the topic. Make sure that you use reliable sources, especially secondary sources that are accurate as well as relevant.

An essay demands close reference to your primary sources. One of the important skills that you will acquire is to recognize just what elements of the text are relevant to the topic and that allow you to develop your own response to it. This is where you should make precise and telling references to specific parts of the work (always giving page references when you do so).

Quotations

There will be occasions when it is appropriate to quote from a text. For instance, you may want to discuss a particular passage closely, in which case it is useful to have the text in front of you and your reader. But remember that a quotation of itself does not necessarily demonstrate a point and make sure that the purpose of a given quotation, short or long, is always clear.

Quotations take a number of forms. An inline quotation is a brief extract from a source which is given in quotation marks, either within a sentence or introduced by a colon. A block quotation is a longer extract — more than forty words or so — and is again introduced by a colon, and presented as a separate indented paragraph.
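
The rule of thumb for choosing between the two forms can be expressed as a short Python sketch. The forty-word threshold comes from the guidance above; the function names, the indentation and the line width are assumptions made for illustration.

```python
import textwrap


def is_block_quotation(quotation, threshold=40):
    """Treat an extract of more than `threshold` words as a block quotation."""
    return len(quotation.split()) > threshold


def format_quotation(quotation, threshold=40, indent="    "):
    """Indent a long extract as a block; put a short one in quotation marks."""
    if is_block_quotation(quotation, threshold):
        return textwrap.indent(textwrap.fill(quotation, width=70), indent)
    return f"“{quotation}”"


print(format_quotation("to be or not to be"))  # “to be or not to be”
```

The threshold is a guide rather than a fixed limit, so you can adjust it to the style conventions you are asked to follow.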

Get your references right

In any academic work, you are expected to document your sources. This is the reason why it is important when doing your research to take careful note of the information, explanations, ideas and arguments on which you wish to draw in your essay. Your essay should always include references in the form of footnotes, and a complete list of all of the sources on which your argument depends in a bibliography at the end of your essay.

Quotation and citation

Be explicit in referring to your sources. Direct reference must always be made to a source on which you rely for information, explanations, ideas or arguments. In the case of quotations, this practice should come very readily to you, as you are making direct use of a source. There will also be occasions when you will cite a source without quoting from it. In these cases too, you must always include direct references to any source on which you rely for information, explanations, ideas or arguments.

3.2 - Structured documents

Learn to make the structure of your documents explicit

Writing forms an essential part of your work as a student of languages. Many of the different kinds of work that you will do use writing as their medium: research and taking notes; presentations; essays and dissertations; translations and other language exercises. Even day-to-day communications depend massively on keyboards and screens — think of email and all social media. Much of the writing you will do as students will involve the use of information technologies too and almost any variety of academic writing can make use of such tools.

An essay is a structured document

Take essay-writing as an instance. You will write many essays in the course of your studies and almost invariably these are nowadays produced using dedicated information technologies, typically word-processors. In some cases, they may be printed and read on paper; but equally they will be circulated electronically and read on screen. In either case, it pays to think about presentation as well as content — and to regard this as an aspect of the careful and precise thinking about language that is at the forefront of your work. In structured documents, these two aspects are interdependent.

Here is an image of what an essay might look like, using some placeholder text.

Elements of document and page design

Several items of the structure of the document are immediately apparent. It contains body text, that is successive paragraphs of content, each of which has the same unjustified format. The first paragraph on the page is preceded by a heading in a different style. There are two block quotations, one of which appears to contain three lines of verse. The document contains two footnotes which are linked to the two block quotations, again in a different style, namely a smaller font size. These are the main visible elements of the design of the document. In addition, you can see two further elements that form part of the page design: a header and a footer, each containing information that will be repeated from page to page.

Make the structure of your document intelligible

These are some of the main components of a typical essay and an application like a word-processor allows you to make the structure of your essay explicit.

Using word-processors and other tools, you can mobilize typographical conventions to make the structure of your document clear.

A LibreOffice file: running header and heading in small caps

Here, for instance, what are termed small capitals are used to denote headings as a structural element of the document. They are also used for headers and footers. Using the same style for all of these components makes the structured hierarchy of your document more intelligible, without multiplying visual variants to excess.

3.3 - Word-processors

Using word-processors to generate structured documents

The most widely used word-processor continues to be Microsoft Word, which is a well-documented application.

You can also consider using LibreOffice Writer for your work in languages for a number of reasons:

Both Word and LibreOffice Writer are typically used as applications installed on your computer. By contrast, Google Docs allows you to create, edit, share and collaborate on documents through a web browser. If you have an Office 365 account, you can also write and edit online in Word.

LibreOffice is an instance of free and open-source software. Google Docs is free for personal use, but not open-source.

Document design using a word-processor

The most important single aspect of the design of your document is body text, in other words the paragraphs and other blocks of text, like quotations and bibliographies, that make up a written piece. The key consideration to bear in mind is to ensure that your content is clear and legible throughout. Among the most important factors are line length and spacing.

For an item like an essay, line length should be about eighty characters in total, including spaces: this makes it easier for the eye to move from line to line. For this reason, you should ensure that your document has generous left and right margins (up to 4 cm).

In Word, you can adjust these settings by choosing File > Page Setup > Margins…

Spacing and indentation

You can adjust line spacing in Word by choosing Format > Paragraph. As line length increases, so should the spacing between lines. Line spacing is the measurement between the baseline of one line of text and the next. The gap between the descenders (in characters like p or q) on one line and the ascenders (in characters like d or l) on the next is traditionally referred to as leading (to rhyme with “heading”). If your font size is 11 pt, then this setting should be at least 13.5 pt. For a font size of 12 pt, your spacing should be 14.5 pt.

In written work like essays, the best way of marking a new paragraph is to apply a small indentation to its first line. Here again, you can ensure that your word-processor applies this style automatically. Choose Format > Paragraph, and set left indentation at the same measurement as your font size, e.g. 11 pt (Word will automatically convert this dimension into your selected default unit of measurement, e.g. to centimetres). You should also set spacing before and after each paragraph to 0 pt. By default, most word-processors add extra line spacing at the end of a paragraph. If you choose to follow this alternative style, you should set this additional space to the same dimension as your font size, e.g. 11 pt, or the equivalent of a blank line. But do not combine paragraph indentation with blank lines between paragraphs.
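
The arithmetic behind these settings is simple enough to check for yourself. The sketch below assumes the desktop-publishing point of 1/72 inch used by word-processors, and an A4 page 21 cm wide; the function names are chosen for illustration.

```python
POINTS_PER_INCH = 72   # the desktop-publishing point used by word-processors
CM_PER_INCH = 2.54


def points_to_cm(points):
    """Convert a measurement in points to centimetres."""
    return points / POINTS_PER_INCH * CM_PER_INCH


def text_width_cm(page_width_cm, left_margin_cm, right_margin_cm):
    """Width left for the text block once both margins are subtracted."""
    return page_width_cm - left_margin_cm - right_margin_cm


print(f"{points_to_cm(11):.2f} cm")           # 0.39 cm: indent for 11 pt text
print(f"{text_width_cm(21.0, 4.0, 4.0)} cm")  # 13.0 cm: A4 with 4 cm margins
```

So a first-line indentation of 11 pt is a little under 4 mm, and 4 cm margins on an A4 page leave a text block 13 cm wide.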

Inline elements

Considerations of style apply also to parts of your writing that arise within sentences and paragraphs. These are referred to as inline elements and include brief quotations, for instance, which are identified through the use of quotation marks. Another instance is the title of full-length works to which you may refer, e.g. Vita nuova, Faust, Cré na Cille, Les Années, Don Quijote.

Using styles

The best method of ensuring that these and other design preferences are applied consistently to your whole document is to use styles. These are formatting options that are built into each word-processing application. To access styles in Word, select Format > Style.

Modifying styles in Microsoft Word

The style that applies generally to body text is Normal, which you can modify in line with your design decisions. Separate styles exist for each of the components of your essay, e.g. block quotations, for which the default style in Word is Block Text (note that you are likely to have to modify this style substantially, as by default it presents quotations in italics, which is not an appropriate choice in your essays).

Once you have formatted the required styles as necessary, you can apply them in the course of writing your essay by selecting Format > Style, and choosing Apply once you have identified the relevant style.

Why use styles?

One of the major advantages of using styles is that any changes you make to an individual format, e.g. block quotations or footnotes, are automatically applied to all instances in your text. You can also apply the same styles in more than one essay: once you have made the necessary modifications, they are available for re-use in each new document.
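
The principle can be illustrated with a small Python sketch, in which each paragraph records only the name of its style. This is a conceptual model, not a description of how any word-processor stores its files; the style names and settings are invented for the example.

```python
# Each style is defined once; the names and settings are illustrative only.
styles = {
    "Body Text": {"font_size_pt": 11, "italic": False, "indent_cm": 0.0},
    "Block Quotation": {"font_size_pt": 11, "italic": True, "indent_cm": 1.0},
}

# The document records which style each paragraph uses, not its formatting.
document = [
    ("Body Text", "An opening paragraph."),
    ("Block Quotation", "A first long quotation."),
    ("Body Text", "Some discussion."),
    ("Block Quotation", "A second long quotation."),
]


def render(document, styles):
    """Pair each paragraph with the current settings of its style."""
    return [(text, styles[name]) for name, text in document]


# One change to the style definition...
styles["Block Quotation"]["italic"] = False

# ...reaches every block quotation in the document.
assert all(not settings["italic"]
           for text, settings in render(document, styles)
           if text.endswith("quotation."))
```

Had the italics been applied by hand to each quotation, every instance would have to be found and changed separately.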

Styles in LibreOffice

LibreOffice is, like Word, extensively documented: search, for instance, for styles. Styles can be edited and applied in LibreOffice Writer in much the same way. Select Styles > Manage Styles to identify an individual element and to apply your modifications to it.

Styles in LibreOffice

For instance, you can opt to have numbered headings throughout your essay if this choice is appropriate to your content.

Styles in Google Docs

Styles can readily be applied also in Google Docs. Once you have logged on to your Google account, you can retrieve your documents and edit them from within a web browser. A file appears on screen in document format.

Styles in Google Docs

You can apply a format to a given element of your document using the editing tools displayed on screen, and then apply that in turn to a given style for use in future documents.

It is also possible to apply more specialized templates by incorporating Add-ons into your Google Docs profile. Here, for example, a dedicated MLA formatter has been added to a Google account for use with a given document.

Apply MLA style with an add-on

Google Docs also allows you to modify your default fonts just as you can when using a word-processor application.

Selecting a font in Google Docs

If you select More fonts, you are given the option of using additional fonts from within Google Fonts.

Options in Google Fonts

Note that you can specify which script you wish to use in selecting a given font style (here, the choice has been restricted to serif fonts).

3.4 - Creating and using PDFs

A PDF can be used in print and on screen

A file in Portable Document Format is a printable document that presents contents, images and other elements consistently from device to device. This means that such a file appears identically irrespective of the machine on which it is opened. This format is now also used to create archives of digital documents.

Generating PDFs

Any word-processing application can be used to create a PDF. In Word, select File > Print, or press CTRL+P (Windows) or CMD+P (macOS). Then, in the PDF tab in the bottom left-hand corner, select Save as PDF.

Save as PDF in Word

You can save the file under an appropriate name (preferably not essay.pdf so as to ensure the work in question can be clearly identified at a later date). Note that you can incorporate metadata into the PDF file by completing the boxes in the lower part of the screen.

In Google Docs, select File > Print and then follow the same procedure as in Word to create a PDF.

To create a PDF in LibreOffice, select File > Export As > Export as PDF…

Save as PDF in LibreOffice

Several options can be applied before exporting and saving the file as a PDF.
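
If you have many files to convert, LibreOffice can also export to PDF from the command line, without opening its menus. The Python sketch below builds and runs such a command. It assumes that LibreOffice is installed and that its `soffice` program can be found on your system path, which is not always the case by default; the function names and the file name in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, outdir="."):
    """Build the LibreOffice command that converts `source` to PDF."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(source)]


def convert_to_pdf(source, outdir="."):
    """Run the conversion and return the expected path of the new PDF."""
    subprocess.run(build_convert_command(source, outdir), check=True)
    return Path(outdir) / (Path(source).stem + ".pdf")
```

Calling `convert_to_pdf("essay-on-faust.odt", "pdf")`, for example, should produce `pdf/essay-on-faust.pdf`. This route applies the default export settings rather than the options available in the export dialogue.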

Advantages and disadvantages

Portable Document Format is a very widely used file format, and because it is also an archival format, material in a PDF is likely to remain accessible for the long term.

There are a number of advantages to using a PDF as the file format in which to save a file:

  • a PDF can generate high-resolution print output
  • a PDF is also an online format and can be viewed within a web browser
  • content and presentation in a PDF are fixed, which means that it will appear identically in each machine in which it is opened (in other words, a PDF is “device-independent”)
  • because your chosen fonts are embedded in a PDF, the printed output will be consistent from printer to printer
  • a PDF can be used to create a file with information which you may want to circulate in a secure format and where you wish to retain the original formatting, for example, in a curriculum vitae
  • a PDF can be used as a presentation format: create and save your file in landscape format and open the file in full screen
  • it is possible to customize a PDF when creating it so as to control its appearance on screen when opened
  • a PDF can be annotated, which makes it a useful research tool
  • PDF files can be combined to create a portfolio

There are one or two disadvantages to using PDF as a format:

  • a PDF file cannot as a rule be modified; if you wish to change the content, the routine option is to edit the original word-processor or other file, and then resave it as a PDF
  • PDFs require dedicated software, though such applications are now very widely available

Hybrid PDFs in LibreOffice

LibreOffice provides the option when exporting to PDF format of embedding the original word-processed file. This means that an original word-processed file can be directly associated with the PDF you generate from it.

LibreOffice: embed ODF file in PDF

If you check the relevant box and save the PDF in this way, you then have the option of dragging and dropping the resulting PDF into LibreOffice in order to access the original Open Document Format file, allowing you to edit the original content — an approach that could be useful if, for example, you are collaborating on a file with another person.

3.5 - Templates

How to use templates in word-processing applications

All word-processing applications allow you to make use of templates in order to maintain a consistent style from document to document. Here are some examples to get you started.

Additional fonts

First, install the following fonts on your machine using these links to download the relevant compressed .zip files:

The Noto CJK fonts allow you to combine Latin script with characters in Japanese, Korean, Simplified Chinese or Traditional Chinese.

Once you have unzipped the downloaded files, you can then install the fonts.

Using the templates

Next, you can download these two templates, both of them designed for use with LibreOffice:

You can also obtain an MLA template for Word directly from Microsoft.

Accessing CJK character sets

Note that you will have to choose which precise Noto CJK font you wish to use according to the language in which you will be writing — here, Simplified Chinese is selected.

Selecting a CJK font in LibreOffice

3.6 - Revision history

How to track versions of your work over time

In both Word and Google Docs, it is possible to review and re-edit earlier versions of your work. This is possible in both cases because your files are saved in the cloud (provided that any document that is opened and edited in the version of the Word application on your computer is also saved in OneDrive).

One document, many uses

Version history in Google Docs

This facility allows you to retrace your steps when you need to do so, for instance when you delete or amend a piece of writing which you then wish to recover. It also means that you can use different versions of a single file for different purposes, for instance a curriculum vitae: in Google Docs, you can assign a specific name to a given version of your work.

Version history as backup

These forms of version history also function as a rudimentary backup for your work. Because all of the versions are saved in the cloud, you should be able to recover your files in the event of a hard disk failure on your computer.

Version history in Word

To access version history in Word, choose File, and then Browse Version History.

3.7 - File formats

A file’s format is expressed in its file extension

Files store information in particular formats, according to their function and often also the application (or applications) with which they are associated. So, you can open an image file with an application like Preview on macOS, or Adobe Photoshop.

Word-processor files

A file format is typically designated by the file’s extension, which takes the following format: .xxx. In other words, the extension contains metadata, or information about the file, typically its file type and the data it contains. A Word file is a text document produced using a word-processor and has the extension .docx.

This is also an Open XML format, which means a Word file can more easily be transferred between computers and also transformed into other document formats. It also makes it compact and modular in its structure. Because information about the file is stored together with its contents, it will open with the same appearance in a different user’s computer.

A LibreOffice file with the extension .odt is also a word-processing file. LibreOffice makes use of the OpenDocument Format. This format supports structured documents that are interoperable. This means that ODF files can be read on machines using different operating systems and with a range of different applications other than LibreOffice.

Both Word and LibreOffice store all of the information associated with a given document in the form of a compressed archive file (usually a .zip file).

Contents of a LibreOffice template file archive in .odt format

As well as the contents of the file, this format contains information about the styles used in a given document as well as its metadata. All of this data is in the form of .xml files, a format designed for the exchange of information between computer applications. The eighteen files of which this example is made up also include an image file in .jpg format embedded in the document — it is the largest file by some margin. Finally, because these files also contain information about the configuration of the LibreOffice application in which the document was created, it too will be retrieved with exactly the same layout when opened in a different machine.
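
You can inspect such an archive yourself with Python's standard `zipfile` module. This is a minimal sketch: the two function names are chosen for illustration, and the file name in the usage note is hypothetical.

```python
import zipfile


def archive_members(path):
    """Return (name, size in bytes) for each file stored in the archive."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def largest_member(path):
    """Return the name of the biggest file inside the archive."""
    return max(archive_members(path), key=lambda member: member[1])[0]
```

Calling `archive_members("my-essay.docx")` on a Word file, or on a LibreOffice file of your own, lists the .xml files and any embedded images that the document contains.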

Word and LibreOffice files can also be used with Google Docs.

Plain text formats

Several file formats exist in which textual information can be encoded more simply than in word-processing files. These are referred to as “plain text” formats because they contain only text. The simplest of these formats is a .txt file, whose contents include none of the formatting found in word-processor files.

In Windows, you can open a .txt file using the Notepad application and in macOS with TextEdit. Both of these are plain text editors. You can also download and install other such editors, like Atom or Visual Studio Code.

Plain text files are used in the creation of corpora. You can download copies of works of literature in this format from Project Gutenberg, either to create a corpus of your own or to use with web applications like Voyant Tools.

Binary document formats

From these examples we can conclude that the contents of a document can exist within files of different formats and that it is possible to transfer the content between formats, so that the document can then be edited using different applications. Google Docs can import and export files in a range of formats.

It is also possible to generate a file in a binary format from a word-processing document. The most common outcome of such an operation is a Portable Document Format file with the extension .pdf. Files in Word, LibreOffice or Google Docs can be used to create a PDF.

Source: Adobe Document Cloud

A binary file is one whose format stores digital data in a form other than plain printable characters. While it is now possible to transform a PDF file into a format in which it can be edited in Word, for instance, a word-processing application cannot be used to open the PDF file directly. Instead, you must use a dedicated application that will retain the exact appearance of the document contained in the file, like Adobe Acrobat Reader.

Because such applications exist for all operating systems and for mobile devices as well as computers, a PDF file is said to be “platform-independent”.
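
Many binary formats begin with a fixed sequence of bytes that identifies them, whatever the file happens to be called. The Python sketch below checks for two well-known signatures and otherwise tests whether the data can be read as plain text. The function name and the labels are assumptions made for illustration, and the list of signatures is far from complete.

```python
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP archive (also used by .docx and .odt files)",
}


def identify(data):
    """Guess a format from the complete contents of a file, given as bytes."""
    for signature, label in SIGNATURES.items():
        if data.startswith(signature):
            return label
    try:
        data.decode("utf-8")  # plain text decodes without error
    except UnicodeDecodeError:
        return "unrecognized binary data"
    return "plain text"


print(identify(b"%PDF-1.7 ..."))                   # PDF document
print(identify("Cré na Cille".encode("utf-8")))    # plain text
```

To apply this to a file of your own, you can read its contents with `Path("essay.pdf").read_bytes()` from the `pathlib` module, where the file name is again hypothetical.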

There are also dedicated open-source applications like AntFileConverter, which allow you to convert word-processor files or PDFs into plain text for use with corpus tools.

Image files

Most image files are also binary files containing digital data. Typical formats include .jpg, .png and .tiff files. In both .jpg and .png files, some of the binary data is compressed. On the other hand, both .png and .tiff files are “lossless”, which means that the original data can be fully reconstructed from a compressed file.
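
What “lossless” means can be demonstrated with Python's standard `zlib` module, which implements the same family of compression that .png and .zip files rely on. This is a sketch of the principle using arbitrary sample data, not a way of processing image files.

```python
import zlib

original = b"abcabcabc" * 1000        # highly repetitive data compresses well
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

assert restored == original            # nothing was lost in the round trip
assert len(compressed) < len(original)
```

A lossy format like .jpg, by contrast, discards some of the original data in order to achieve smaller files, so the original cannot be reconstructed exactly.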

Font files

A digital typeface is typically made up of a series of font files that contain the information required to render different individual fonts, e.g. that typeface’s roman, italic and bold styles or weights. An OpenType font file contains the information needed to allow the font to be displayed using computers with different operating systems, e.g. Windows or macOS. OpenType fonts can also contain extended Unicode character sets and a range of typographic features, e.g. small capitals, non-lining numerals.

TrueType is a further scalable font format that can be accessed both in Windows and macOS. Web Open Font Format makes use of compressed font files for display in web applications.

Common file extensions and formats

.css
a file in Cascading Style Sheet language that controls the appearance of an HTML file on screen, in print and in other media
.docx
a Word text document
.epub
an e-book file for use with most e-book readers (note, however, that a Kindle uses files in .azw or .azw3 format); as well as text, it can support color images, graphics, interactive elements, and video files
.html
a document in Hypertext Markup Language designed for display on the web
.jpg
a compressed lossy image format
.key
a macOS presentation file
.md
a Markdown document; a plain text format
.mov
a QuickTime video file
.mp3
a lossy audio format
.mp4
a video file format
.odt
a LibreOffice text document
.otf
an OpenType font file
.pages
a macOS text document
.pdf
a printable document in Portable Document Format that presents contents, images and other elements consistently from device to device
.png
a lightweight lossless image format
.pptx
a Microsoft PowerPoint presentation file in Office Open XML format
.rtf
a document data format used for saving and sharing text files
.tiff
a lossless image format
.ttf
a TrueType font file
.txt
a plain text document
.wav
a lossless audio format
.woff, .woff2
Web Open Font Format font files
.xml
a file in Extensible Markup Language used for storing and transporting information
.zip
an archive file format that supports lossless compression of files or directories

4 - Tips

Shortcuts and techniques to make your work easier and more efficient

4.1 - Use the keyboard, not the mouse

You can use the keyboard to activate commands in any application

Many computer users find it more convenient to use the keyboard rather than a mouse or other device to enter commands. You will find that almost any aspect of the operation of an application can be controlled using keyboard shortcuts.

System-wide shortcuts

Basic operations can be performed by means of shortcuts.

Command Shortcut
Copy CTRL+C (Windows) or CMD+C (macOS)
Cut CTRL+X or CMD+X
Paste CTRL+V or CMD+V
Select all CTRL+A or CMD+A
Undo an action CTRL+Z or CMD+Z
Redo an action CTRL+Y or CMD+Y
Save CTRL+S or CMD+S
Save as CTRL+SHIFT+S or CMD+SHIFT+S
Print CTRL+P or CMD+P
Close the active document CTRL+F4 or CMD+W

If you find it convenient to use the keyboard to carry out operations like these, you will find a much fuller range of options in Windows and in macOS.

Shortcuts in web browsers

These commands function both in Firefox and in Chrome.

Command Shortcut
Open new window CTRL+N or CMD+N
Open new tab CTRL+T or CMD+T
Go to the top of the page CTRL+↑ or CMD+↑
Go to the bottom of the page CTRL+↓ or CMD+↓
Print CTRL+P or CMD+P
Reload CTRL+R or CMD+R
Find in page CTRL+F or CMD+F
Highlight current address CTRL+L or CMD+L
History CTRL+H (Windows); in macOS, CMD+Y in Chrome or CMD+SHIFT+H in Firefox

Shortcuts in word-processors

Many of the system-wide shortcuts function in word-processors, and Word, LibreOffice and Google Docs all have many additional commands that can be activated through the keyboard.

Here are some useful ones to keep in mind in using any of these applications.

Command Shortcut
Open a document CTRL+O or CMD+O
Open a new document CTRL+N or CMD+N
Edit link CTRL+K or CMD+K
Find CTRL+F or CMD+F
Italics CTRL+I or CMD+I
Bold CTRL+B or CMD+B

4.2 - Accented and special characters

Keyboards and applications make it possible to enter special characters in Latin script

You will produce plenty of written work in the languages you are studying and will also find yourself quoting from primary sources in essays written in English. One of the hallmarks of good writing in your studied languages is precision in the treatment of characters that are used with accents or diacritics.

Special characters using the keyboard

Accented characters in Latin script can be obtained by combining keystrokes.

Windows macOS
é CTRL+', e Option+e

The same combination of strokes with other relevant characters gives further accented versions, as in á, or í, or ó, or ú.

Windows macOS
É CTRL+', E Option+E

The same combination of strokes with an uppercase character gives an accented capital letter.

Here are further combinations in Latin script, giving accented or other special characters used in European languages.

Windows macOS
à CTRL+`, a Option+`, a
û CTRL+SHIFT+^, u Option+^, u
ï CTRL+SHIFT+:, i Option+*, i
ã CTRL+SHIFT+~, a Option+n, a
æ CTRL+SHIFT+&, a Option+'
œ CTRL+SHIFT+&, o Option+q
ç CTRL+SHIFT+,, c Option+c, c
ß CTRL+SHIFT+&, s Option+s

Again, the same combinations can be used with other characters where necessary, and uppercase versions can also be generated as in the example of É above.

In macOS, a further method of inserting accented characters is to hold down the key for the character in question. You will then see the several available accented versions displayed.

Available accented characters in macOS

You select the relevant option, here number 2, and the correct character appears.

Punctuation and quotation marks

Latin script incorporates many specific forms of these marks used in different European languages. Here are some of the most common of these.

Windows macOS
« CTRL+SHIFT+, Option+\
» CTRL+SHIFT+. Option+SHIFT+\
CTRL+, Option+SHIFT+,
CTRL+. Option+[
¡ ALT+CTRL+SHIFT+! Option+1
¿ ALT+CTRL+SHIFT+? Shift+Option+?

Resources of a keyboard

You can verify all of the different key combinations that a keyboard may contain, both in Windows and in macOS.

In macOS, you need first to make sure that you can access the keyboard. Open Keyboard in System Preferences and select Input Sources. Here, check the box labelled Show Input menu in menu bar. You will see either a language flag or a keyboard icon appear in the top right-hand corner of the menu bar. Click here and select Show Keyboard Viewer.

Keyboard Viewer in macOS

Here, you can see all of the characters that can be generated by combining Option with a given key in the Irish-English keyboard. You can also see which characters can be obtained by combining Option+Shift.

Option+Shift keyboard options

In Windows, you can install or select a given keyboard in the Taskbar at the bottom of the screen. If you then activate the keyboard icon, the available characters will be displayed.

A French keyboard in Windows

Inserting symbols in applications

If you are unsure of the method of obtaining a given character via the keyboard, there are other means by which you can do so using an application like a word-processor. In Word, for instance, in the Insert ribbon, you can select Advanced Symbol.

Accessing the symbol menu in Word

This will bring up a display like the one illustrated here, where you can locate and select the character you need.

Selecting a character in Windows

This is a useful means to insert a character when you may not already know the key combination required to generate it.

Note also that Windows allows you to use further shortcuts to enter characters, as shown here, where you can type 0132, followed by ALT+X, and see the IJ digraph appear. Use the Advanced Symbol display to discover other shortcuts that may be useful.

4.3 - Learn to touch type

There are many advantages to being able to touch type

Learning to type online: typing.com

Touch typing involves using all of your fingers to enter text via the keyboard: it is as simple as that. Computer keyboards have what are termed home keys: there are slightly raised ridges on the F and J keys to help you place your hands in the right position for touch typing.

Typing is relevant to almost all aspects of your work: searching a library catalogue, using a word-processor to write an essay, communicating via social media and exploring the internet all involve using the keyboard. The ability to touch type will greatly facilitate all of these and other kinds of activity, online and offline.

Advantages of touch typing

Touch typing makes an immediate and practical difference in several ways:

  • you can write and compose more rapidly and more accurately using the keyboard
  • you can view the screen rather than the keyboard while you type, which helps concentration and flow as you write
  • it also makes it easier to edit your work as you progress
  • touch typing is less tiring and less likely to be harmful to your hands and wrists
  • the ability to touch type may be useful to you in your future work as well as your studies

Tools for touch typing

There are many online tools that allow you to learn and to practice your typing. One of these is typing.com.

Home keys: learning to type online

Using sites like this or dedicated applications, it is possible to make rapid progress.

Keyboards and languages

The predominant keyboard layout in use today is QWERTY, which is dedicated to typing in English. When typewriters were in use, different layouts existed for different languages: these were designed with letter frequencies in mind and with the need for special characters. Thus, the traditional French keyboard is known as AZERTY.

French keyboard layout

As you can see, there are dedicated keys for accented characters.

Spanish ISO keyboard layout

A Spanish keyboard follows the QWERTY layout, but with dedicated keys for accented characters and punctuation marks.

It is possible, but not always very convenient, to learn to use more than one keyboard. Alternatively, you can learn to use specific techniques on a Windows or Mac or other computer to enter special characters.

Wubi Xing keyboard

On the other hand, a dedicated keyboard can make it more practicable to enter text in a distinct script, like Simplified Chinese.

4.4 - Smart typography

Learn how to use all of the resources of a typeface

The Latin script is the most widely used in the world today and encompasses several blocks in Unicode. This means that it encompasses, for instance, the diverse punctuation signs and quotation marks used in different European languages.

Quotation marks

Quotation marks

It is important when you use a font that you use “smart” rather than “straight” quotation marks. In Word, you can ensure this happens automatically by selecting Word > Preferences > AutoCorrect > AutoFormat As You Type and checking the box labelled “‘Straight quotes’ with ‘smart quotes’”. Note that in the example above the quotation marks are aligned with the ascender line.

In general, you should follow the conventions for quotation marks in the language in which you are writing. So, if you are writing an essay in English, you should not use guillemets for quotations in French. Conversely, if you are writing in French, you should substitute guillemets for single or double quotation marks throughout.

Guillemets

Dashes

It is important also to distinguish between hyphens and dashes.

A hyphen is used to connect two words used as a phrasal unit: “a nineteenth-century novel” (as distinct from “the novel is a major genre in the nineteenth century”), or in multi-word expressions in different languages, e.g. “bien-être”.

Hyphens and dashes

Dashes have a different purpose. An en-dash is used with ranges of numbers, e.g. 11–22, or to express the sense “between”, e.g. “the Franco–Prussian war”. An em-dash is used as a punctuation mark to separate parts of a sentence more decisively than with a comma.

If you are interested in languages, you need to know about characters — and Unicode.

Em-dashes can also be the equivalent of parentheses:

Madame Bovary — first published in 1857 — was soon translated into English.

They are also used in French to introduce direct speech:

— « Félicité ! la porte ! la porte ! »

4.5 - Working with links

Embedding hyperlinks in your documents

A hyperlink is an element of a document or webpage that links to a website or webpage, another document, or to part of the same or another document. Hyperlinks are key building blocks of the internet.

Wherever you use an online source, you should provide a link to it. To go automatically to the address bar of a browser and highlight the current address, press CTRL+L (Windows) or CMD+L (macOS); you can then press CTRL+C or CMD+C to copy the link, so that you can insert it into your own document.

At its simplest, a link is a current web address as displayed in a web browser.

You will also find yourself using other types of link. A persistent URL is a link that is designed to be permanent and is often a feature of library catalogue pages.

Persistent link in a library catalogue page

Where one is provided, you should always make use of a persistent link.

A digital object identifier (DOI) is another important kind of link: it is a persistent identifier for a published item and should therefore be used in your bibliographies or lists of works cited.

A digital object identifier in an academic journal

Links can be used to document your sources. Make sure you follow the relevant style conventions in incorporating them into your work.

4.6 - What is a snippet?

Discover how to automate the entry of frequently used words or phrases

A snippet is a useful shortcut: a short string of text expands into a word or phrase that you use frequently in your work. On a Mac, for example, where snippets are enabled the words “On my way!” will appear by default in the application that you are using if you type “omw”.

In macOS, you can create such snippets by opening System Preferences and then selecting Keyboard > Text. You can then create snippets of your own, for example, your name, or your email address, or your degree programme, but also titles of books to which you refer often in an essay or a recurring phrase or concept, e.g. “gmm” for “grammaticalization”.
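The principle behind a snippet can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how macOS or any expansion tool works internally; the table of abbreviations and the function name are hypothetical, with “omw” and “gmm” taken from the examples above.

```python
import re

# Hypothetical table of abbreviations and their expansions.
SNIPPETS = {
    "omw": "On my way!",
    "gmm": "grammaticalization",
}

def expand_snippets(text, snippets=SNIPPETS):
    """Replace each abbreviation that appears as a whole word with its expansion."""
    # \b marks a word boundary, so "gmm" inside a longer word is left alone.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, snippets)) + r")\b")
    return pattern.sub(lambda match: snippets[match.group(1)], text)

print(expand_snippets("The gmm of this verb is discussed below."))
```

Matching on whole words only is a deliberate choice here: it prevents an abbreviation from being expanded when the same letters happen to occur inside another word.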

Using dedicated tools

Dedicated tools can be used in more extensive and flexible ways to create and use snippets.

aText: an expansion tool for Mac and Windows

Here are some default examples in the application aText, to which you can add other more specific ones of your choice.

4.7 - Taking and using screenshots

Make use of screenshots as illustrations in your work

You can use a computer to make a screenshot of material that you wish to analyse or discuss in your work.

Windows now includes an application named Snip & Sketch: to activate it, press the Windows logo key+SHIFT+S. You can then access a range of tools to capture and annotate any part of the screen that is of interest to you. To obtain a capture of your whole screen, press the PrtScn key.

In macOS, the simplest method to make a screenshot is to use the built-in software that allows you to take shots as follows:

  • CMD+SHIFT+3: capture a shot of your desktop. An icon will appear on-screen: click on this and you can then annotate or crop the shot as you wish
  • CMD+SHIFT+4: capture a shot of a specific portion of your screen by dragging the crosshairs that appear over the desired area

File format

Wherever possible, you should capture screenshots in .png format: this is a lossless format and is therefore likely to preserve the detail of the original screen image.

Dedicated applications

You can also make use of dedicated applications to capture screenshots.

References and fair use

You should always include a caption, giving your source and including a link where relevant.

When using screenshots, you should bear considerations of “fair use” in mind. The source of the screenshot should always be given in any work where you reproduce it. The creation of a screenshot for personal use is generally considered to be fair use, but you should take care to check carefully any limitations that a given site or other resource may place on making copies through screenshots or other means.

4.8 - Search the internet

How do you locate the information you seek on the internet?

The most common kind of search is via a search engine querying the public internet. Google is the most commonly used search engine. DuckDuckGo is an instance of a search engine that allows greater scope for privacy, in that it does not track users and does not store personal information.

Search tips

  • to search a specific site, use the following method: CJK site:unicode.org (note that you do not need to specify the protocol (https) nor include the www prefix)
  • to search for exact matches of a specific phrase, use quotation marks: "detective fiction", or "Victor Hugo" drawings
  • you can use wildcards to extend the scope of a search: i think therefore *
  • wildcards can also be used with search terms: gramm* will return results with “grammar”, “grammatical”, “grammaticalization”, and so on
  • a wildcard will also return terms with variant spellings: characteri?ation will return “characterization” and “characterisation”
  • to limit your search to resources with a specific page title, use this method: character intitle:unicode
  • to find out which pages link to a site you are interested in, use the following method: link:europeana.eu (in this case, you can locate versions of the site in question in different European languages)
  • you can limit your search by specifying a file type: "Victor Hugo" drawings filetype:jpg
  • you can search social media by using the relevant prefix as follows: @twitter unicode

You can also use what are termed Boolean operators in your searches:

  • novel AND fiction: search for material where both terms occur
  • novel OR fiction: search for material which contains either or both of the terms (thus, wider in scope than a search with AND)
  • novel NOT fiction: search for material that contains the first term but does not contain the second

You can also exclude a term by using a hyphen: Celtic -football

These operators can also be combined: "Victor Hugo" AND (novel OR poetry)
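If you build searches of this kind often, the operators described above can be assembled programmatically. The following Python sketch is an illustration only: the function build_query and its parameter names are hypothetical, and it simply joins the operators into a search string without contacting any search engine.

```python
from urllib.parse import quote_plus

def build_query(terms, phrase=None, site=None, filetype=None, exclude=None):
    """Assemble a search string from the operators described above."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')           # exact phrase in quotation marks
    parts.extend(terms)                       # ordinary search terms
    if exclude:
        parts.append(f"-{exclude}")           # a hyphen excludes a term
    if site:
        parts.append(f"site:{site}")          # restrict the search to one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to one file type
    return " ".join(parts)

query = build_query(["drawings"], phrase="Victor Hugo", filetype="jpg")
print(query)
# URL-encode the query so that it can be appended to a search address.
print(quote_plus(query))
```

The same function can reproduce the other examples, e.g. build_query(["Celtic"], exclude="football") or build_query(["CJK"], site="unicode.org").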

Searching repositories

Major data collections or repositories have their own dedicated search facilities.

Take the example of data.bnf.fr, a public database provided by the Bibliothèque nationale de France.

data.bnf.fr: a web interface to BNF data

As you can see, this site provides its own search engine, through which you can apply several of the search techniques described above.

Because this database has its own public domain name, you can also search its contents using a search engine, again using the techniques described above and some further search options:

  • to locate items connected to a given keyword, e.g. “roman”, meaning “novel” in French, you can search with wildcards: roman* site:data.bnf.fr

This search will produce a set of terms connected to “roman”, e.g. “roman d’aventures”, “roman grec”, “roman à clefs”.

Repositories that you access through the Library will also have their own search engines. Note also that you can search across these from the Library homepage.

OneSearch: applying search techniques in the Library homepage

Searching the Creative Commons

You may on occasion wish to use relevant illustrations in your work.

Image search in the Creative Commons

The Creative Commons is a major open-access resource with its own dedicated search engine — note too its guidance on its own specialized search query syntax. Make sure to document your use of such sources by including captions with the relevant copyright information.

5 - Scripts and Unicode

How to work with scripts and with Unicode

5.1 - What is a script?

A script is the medium of writing in a given language — or languages

The written word is pervasive and scripts are the basis on which we communicate in writing. With the publication of Unicode 14.0.0 on 14 September 2021, the standard now supports 159 scripts.

This is how Unicode defines a script: “A collection of letters and other written signs used to represent textual information in one or more writing systems. For example, Russian is written with a subset of the Cyrillic script; Ukrainian is written with a different subset. The Japanese writing system uses several scripts” (Glossary of Unicode terms). Within this framework, the different scripts in the world, historical and contemporary, present wide variations.

About two-thirds of the writing systems in the world today use alphabetical scripts

Latin script

This is the script that you are likely to use most often: it is the one in which English and many European languages are written, and is the most widely used script today.

Latin script features

Scripts can be defined according to a number of characteristics, using Unicode or typographical terminology. Here are a number of these characteristics in the case of Latin script:

  • Latin script is alphabetical
  • it is bicameral, meaning that it has upper-case and lower-case characters, and is case-sensitive — so, we recognize brown to be an adjective and Brown to be a proper noun
  • it is a left-to-right script
  • it uses spaces as word-separators
  • Latin script uses hyphenation
  • Latin script uses what is termed a mid-baseline, with some characters having elements that descend below the baseline
  • Latin script has what are termed its own native digits, or numerals

You can see in the example above that the character d is used in its upper-case form. The bounding boxes allow us to see that words are separated by spaces and that some characters, for example, d, h, k, b and f, have ascenders, or parts of the character that extend above what is termed the script’s x-height. Likewise, j, p, y and g have descenders. Note too that the question mark extends above x-height.

Arabic script

Arabic script

Arabic script presents several more distinguishing features than Latin script. After Latin script, it is the second most widely used script in the world.

  • Arabic is a right-to-left mid-baseline script
  • the script directly represents only consonants and long vowel sounds; in other words, it is an abjad
  • short vowel sounds and other phonetic information are denoted by diacritics
  • it is a cursive script; in other words, the characters “join up”
  • the shape of cursive characters can be determined by the characters to which they are joined
  • characters can also overlap
  • unlike Latin script, it is not case-sensitive
  • like Latin script, it has native digits
  • spaces are used as word-separators
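Some of these properties are recorded in the Unicode character database and can be inspected with the unicodedata module in the Python standard library. The sketch below is illustrative; the three sample characters (a Latin letter, the Arabic letter beh and an Arabic-Indic digit) are chosen arbitrarily.

```python
import unicodedata

# Arbitrary samples: Latin small a, Arabic letter beh, Arabic-Indic digit three.
samples = ["a", "\u0628", "\u0663"]

for char in samples:
    print(
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.category(char),       # Ll = lower-case letter, Lo = letter without case
        unicodedata.bidirectional(char),  # L = left-to-right, AL = Arabic letter (right-to-left)
    )

# A native digit still has an ordinary numeric value.
print(unicodedata.digit("\u0663"))
```

The category Lo reported for the Arabic letter reflects the absence of case, while the bidirectional class AL marks it as belonging to a right-to-left script.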
An Arabic keyboard

Note the Arabic numerals on the second row from the top of this lower case keyboard.

CJK scripts

CJK scripts refer to Chinese, or Han, ideographs used in the writing systems of Chinese and Japanese, and to a more limited extent in Korean. Unicode supports more than eighty thousand Han characters.

Here is an example of a sentence using the Simplified Chinese script.

CJK: Simplified Chinese

Can you identify the use of Traditional Chinese quotation marks here, as well as the European comma and full point?

  • Han scripts are ideographic, with characters usually representing a spoken syllable
  • for this reason, Han script is also referred to as a logosyllabary
  • Japanese script features both syllabic and ideographic-syllabic text, with word-spacing being used with the former
  • CJK scripts are generally written left to right and can also be written vertically
  • they are not case-sensitive
  • Han does not use spaces as word-separators, though the justification of lines leads to adjustments in the placement of characters within their frames
  • Korean, by contrast, does feature spacing between words
  • both Han and Japanese scripts use a centred baseline, whereas in Korean a bottom baseline is used
Han: case and boundaries

A Han ideogram can be thought of as contained in a uniform square frame: here, the characters are displayed in visible bounding boxes to illustrate the absence of features like case and word separation. Note that punctuation marks imported from European scripts are full-width rather than half-width, and therefore do not require additional spacing.

A historical script: Ogham

Unicode extends also to historical scripts that are no longer current, one example being the medieval script of Ogham, which was widely used for inscriptions in Ireland, and also in parts of Britain. Here is an example of an Ogham inscription.

An inscription in Ogham

  • Ogham is an alphabetical script, with incisions corresponding to characters in the Latin alphabet
  • it is a left-to-right script, with a mid-baseline
  • many original Ogham inscriptions are vertical, reading from bottom to top, as in this example
  • Ogham inscriptions did not make use of word-spacing
  • Ogham forms a block in the Basic Multilingual Plane in Unicode
  • the Noto Project includes a font for Ogham
Inserting an Ogham character in LibreOffice

5.2 - What is Unicode?

With the development of Unicode, there is a unique code for each character in each language — or script

Unicode is a computer standard that provides a single code for each character in each language, or each script, bearing in mind that a given script can be used by more than one language. To quote the Unicode site, “Unicode provides a unique number for every character, no matter what the platform, program, or language is”.

Encoding scripts

All of the data associated with a given script are contained in character code charts maintained by the Unicode Consortium. These charts, then, represent the character set of the script in question.

Here are the characters contained in Basic Latin, which is the first block of the Latin script in Unicode.

Basic Latin block in Unicode

This block contains the familiar basic alphabet, and some punctuation and other characters. Note that this chart denotes each character’s code point in Unicode in hexadecimal form, a number system in base 16: 0–9, A–F.

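The printable characters in Basic Latin begin at U+0020, which represents a space, and extend to U+007E, which is the tilde character. The block as a whole runs from U+0000 to U+007F, the remaining code points being control characters.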

Each and every character in Unicode has its unique hexadecimal reference in this form.
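You can look up these references for yourself. The following sketch uses only built-in Python functions and the standard unicodedata module; the helper name code_point and the sample characters are chosen for illustration.

```python
import unicodedata

def code_point(char):
    """Return the Unicode reference for a single character, e.g. U+0041."""
    # ord() gives the code point as a number; :04X formats it in hexadecimal.
    return f"U+{ord(char):04X}"

# Space, upper-case A, tilde, e with acute accent, and the IJ digraph.
for char in [" ", "A", "~", "\u00e9", "\u0132"]:
    print(repr(char), code_point(char), unicodedata.name(char))

# The conversion also works in the other direction:
# chr() returns the character for a given code point.
print(chr(0x0132))
```

Hexadecimal literals such as 0x0132 correspond directly to the U+0132 notation used in the Unicode charts.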

How is Unicode organized?

Unicode is made up of seventeen planes, numbered 0 to 16, with no characters assigned as of yet to planes 4 to 13.

Unicode planes in UnicodeChecker

The Basic Multilingual Plane contains the characters for almost all modern languages, including those using the Latin, Arabic and CJK scripts, and also a large number of symbols.

The Supplementary Multilingual Plane contains historical scripts, for example, Egyptian hieroglyphs.

Egyptian hieroglyphic block in UnicodeChecker

The Supplementary Ideographic Plane includes CJK ideographs that have been added to more recent versions of Unicode.
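The plane to which a character belongs follows directly from its code point, since each plane holds 65,536 code points. Here is a minimal sketch in Python; the function name plane_of is hypothetical and the three sample characters are chosen for illustration.

```python
def plane_of(char):
    """Return the number of the Unicode plane that contains a character."""
    # Each plane holds 65,536 (0x10000) code points, so the plane number
    # is the code point divided by 0x10000, discarding the remainder.
    return ord(char) // 0x10000

samples = {
    "Latin capital A": "\u0041",
    "Egyptian hieroglyph A001": "\U00013000",
    "CJK ideograph U+20000": "\U00020000",
}

for label, char in samples.items():
    print(f"{label}: U+{ord(char):04X} is in plane {plane_of(char)}")
```

Plane 0 is the Basic Multilingual Plane, plane 1 the Supplementary Multilingual Plane and plane 2 the Supplementary Ideographic Plane.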

Supplementary Ideographic Plane in UnicodeChecker

Unicode blocks: the case of Latin script

Each plane in Unicode is made up of a number of blocks. Latin script, for instance, is made up of a total of 1286 characters in the Basic Multilingual Plane, divided into twelve blocks.

Block UniView Unicode
Basic Latin Character data Character chart
Latin-1 Supplement Character data Character chart
Latin Extended-A Character data Character chart
Latin Extended-B Character data Character chart
Latin Extended-C Character data Character chart
Latin Extended-D Character data Character chart
Latin Extended-E Character data Character chart
Latin Extended Additional Character data Character chart
Halfwidth and Fullwidth Forms Character data Character chart
IPA Extensions Character data Character chart
Phonetic Extensions Character data Character chart
Phonetic Extensions Supplement Character data Character chart
Unicode blocks in Latin script

 

Here are the characters that make up the second of these blocks, Latin-1 Supplement.

Latin-1 Supplement in the Basic Multilingual Plane

Unicode charts are now available in French as well as English, with the added possibility of searching for characters by name.

How computers handle characters

When you are working in a given application and select a character on a keyboard, you need to be sure that it will then appear on screen: you depend on a character encoding for this to happen.

The most common encoding used today is UTF-8. It is the standard encoding for the internet and is also the default in word-processing applications. When you enter a character in UTF-8, a computer will then transform that into binary code in bits, or minimal units of digital information (a 0 or a 1 in base 2 numbers).

For example, the character A in UTF-8 has the following value in binary code: 01000001.
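You can reproduce this value with a short Python sketch. The function name utf8_bits is hypothetical; it encodes a character in UTF-8 and prints each resulting byte as eight binary digits.

```python
def utf8_bits(char):
    """Return the UTF-8 encoding of a character as groups of eight bits."""
    # encode() yields the bytes; :08b formats each byte as 8 binary digits.
    return " ".join(f"{byte:08b}" for byte in char.encode("utf-8"))

print(utf8_bits("A"))       # a single byte
print(utf8_bits("\u00e9"))  # e with acute accent: two bytes in UTF-8
```

Characters outside the Basic Latin block produce more than one group of eight bits.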

Unicode and UTF code points in UnicodeChecker

Here, you can see the familiar hexadecimal reference for upper-case A in Unicode. UTF-8 is one of three transformation formats, alongside UTF-16 and UTF-32. They differ according to the number of bytes, or units of 8 bits, they use to encode a character. UTF-8 uses between one and four 8-bit bytes; UTF-16 uses two bytes (that is, 16 bits at a minimum) or four bytes; UTF-32 uses four bytes to encode each character (that is, 32 bits).

Code lengths in Unicode and in UTFs

Here again are some data for the upper-case character A. You can see that there is one code point or code unit in each case, but the number of bytes varies from one to four, with UTF-8 being the most economical format for simple characters in Latin script. This is one reason why it is today by far the most widely used encoding on the internet.

IJ: code length

By contrast, the upper-case ij digraph as used in Dutch represents a single code point and requires two bytes in UTF-8. An em-dash requires three bytes.

Em-dash: code length

CJK characters in UTF-8 are as a rule longer than those in Latin scripts.
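These code lengths can be checked in Python by encoding a character in each format and counting the resulting bytes. This is a sketch for illustration: the function name byte_lengths is hypothetical, and the CJK ideograph is an arbitrary example rather than the one shown in the figures.

```python
samples = {
    "upper-case A": "\u0041",
    "IJ digraph": "\u0132",
    "em-dash": "\u2014",
    "CJK ideograph": "\u4e2d",  # arbitrary example character
}

def byte_lengths(char):
    """Return the bytes a character needs in UTF-8, UTF-16 and UTF-32."""
    # The "-le" variants are used so that no byte-order mark is counted.
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return tuple(len(char.encode(encoding)) for encoding in encodings)

for label, char in samples.items():
    utf8, utf16, utf32 = byte_lengths(char)
    print(f"{label}: UTF-8 {utf8}, UTF-16 {utf16}, UTF-32 {utf32} bytes")
```

Only the UTF-8 figure changes from one of these characters to the next; UTF-32 always uses four bytes.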

CJK character: code length

What, then, is the purpose of any of these transformation formats? It is to connect a Unicode reference-point to the purely binary encoding through which a computer operates.

Translating code points between Unicode and binary

It is in binary format that the characters you type are ultimately stored in a computer, in other words in bits and in turn in bytes. This explains why the size of a file, and of all of the data it may contain in the form of numbers, formulae or text, is likewise expressed in bytes (e.g. kilobytes, megabytes).

File size in kilobytes (k)

Transformation formats such as UTF-8 can represent any character in Unicode, which means that you can expect to be able to input any character in any script in the course of your work. The existence of a single standard and a universal encoding like UTF-8 therefore greatly simplifies your work. UTF-8 values are also much easier to handle than plain binary code would be.

5.3 - Unicode tools

Discover the range of tools that allow you to interact with Unicode

A number of tools exist through which you can explore the relationship between scripts, languages and characters, in particular the web-based resources developed by Richard Ishida.

You can also use applications directly in your computer. The Unicode Consortium has developed Unibook Character Browser as a resource for Windows; this also requires you to download and install related character property data.

An alternative application for Windows is BabelMap.

For macOS, UnicodeChecker is a utility that is comparable in scope.

Unihan data in UnicodeChecker

For characters in the CJK blocks in Unicode, UnicodeChecker also summarizes information derived from the Unihan Database.

5.4 - Using the International Phonetic Alphabet

Learn how to carry out phonetic transcriptions

The International Phonetic Alphabet (IPA) is an alphabetical system of phonetic notation. It is based on Latin script.

The IPA chart (source: r12a)

One convenient way to generate a phonetic transcription using the IPA is via an IPA picker: you select the symbols required and they then appear in a box at the top of the screen. Here is a transcription of the German word für. You can then, for example, copy and paste a transcription into a word-processing document.

When you do so, make sure that the word-processor is using a font that supports the IPA.

IPA-compatible fonts (source: r12a)

It is also possible to enter phonetic symbols directly in a word-processing application, again provided that a suitable font has been selected.

Phonetic transcriptions in LibreOffice

LibreOffice lends itself particularly well to this task. You can see that the transcription of the German word “für” includes two symbols, the triangular colon and the small turned a character, that do not form part of the Basic Latin block. The IPA is, however, included in Unicode and when you need to input a symbol that is not to be found in the Basic Latin block, you can click on the omega icon in LibreOffice in order to select the relevant character.

Selecting an IPA symbol in LibreOffice

A panel will then open below the menu bar. Select the relevant Unicode block (in this case, IPA Extensions) and then insert the required symbol.
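If you are unsure which block a given phonetic symbol belongs to, a short script can tell you. The sketch below assumes a four-symbol transcription of “für” for illustration only (the transcription shown in the screenshot may differ), and the block ranges are copied by hand from the Unicode code charts for just the three blocks needed here.

```python
import unicodedata

# Block ranges from the Unicode code charts (only the blocks needed here).
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "IPA Extensions": (0x0250, 0x02AF),
    "Spacing Modifier Letters": (0x02B0, 0x02FF),
}

def block_of(char: str) -> str:
    """Return the name of the block that contains the character."""
    code_point = ord(char)
    for name, (start, end) in BLOCKS.items():
        if start <= code_point <= end:
            return name
    return "other block"

# Assumed transcription: f, y, triangular colon, small turned a.
transcription = "fy\u02d0\u0250"
for char in transcription:
    print(f"U+{ord(char):04X}  {block_of(char):26}  {unicodedata.name(char)}")
```

On these ranges, the small turned a falls in IPA Extensions, whereas the triangular colon falls in Spacing Modifier Letters, so you may need to look in more than one block when inserting symbols.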

6 - Fonts

Explore fonts that support Unicode character sets

6.1 - Types and usage

Fonts come in many forms and these forms can shape how we communicate in print

Features of the Latin script that continue to shape how we communicate in print go back to the century that followed the invention of movable type in Europe and even further to ancient Rome.

The profile of a font

One of these ancient features is the presence of serifs, or projections at the extremes of certain strokes, as in the upper case e and in x here.

Profile of a serif font

Each font has a number of distinctive elements, notably its x-height, or the height of lower-case characters, and its capital height. The full point size of a digital font is the distance between its descender and ascender lines (typography, needless to say, has its own distinctive terminology).

A font’s x-height has an important bearing on its legibility.

Contrasting x-heights

Fonts of quite similar point size can have x-heights that differ quite sharply. The raised x-height of Noto Serif means that it is more legible at smaller point sizes, provided that inter-line spacing is slightly increased.

Serif and sans serif

Many features of the design of serif fonts derive from calligraphy. Sans serif fonts, by contrast, are more geometric and less ornamental in character.

Serif and sans serif

Even so, some sans serif fonts retain the contrast between thinner and thicker elements of a character: compare the shoulder of the letter h in the two forms of the Noto font, or the counter of the sans serif g. Note also that in the serif and sans serif forms of Noto the x-height, capital height, ascender and descender lines are identical. Today, many fonts exist in the form of font families encompassing serif and sans serif forms with many variants of style and weight. Noto Sans is the font used in this site.

Styles and weights

Italic fonts were first used in the sixteenth century to print material in Latin.

Styles and weights: roman, italic and bold

Nowadays, they are used for more specialized purposes, e.g. to emphasize individual words (especially in serif fonts) and to denote the titles of books and other full-length works, e.g. Le città invisibili, Le Mariage de Figaro. Word-processors sometimes present block quotations by default in italics, but this is not a style that you should follow in your essays; likewise, italics should not be used for inline quotations.

Bold fonts can also be used for emphasis and to highlight the structural hierarchy of elements of a document, e.g. in headings. Note that here all three typefaces have the same point size, but differ in width.

OpenType fonts

OpenType is the most widely used font format today. This is a file format that allows several different font variants to be combined in a single file, extending to features that we will now consider, including small capitals, ligatures and various forms of numerals.

Variations on the number 1

Here are the several forms of the number 1 in the font Faustina, including forms used in fractions, lining and non-lining versions with tabular and proportional variations. It is possible to activate any OpenType forms you wish to use in a document using controls in a word-processor.

Small capitals and structural divisions

Small capitals, which are characters that take the form of capitals but are proportional in height and weight to lower-case characters, also emerged in the sixteenth century, typically to designate headings or other structural elements.

Small caps

As you can see here, they are marginally taller than the font’s x-height.

Roman and small caps

An unbroken series, as here, without any upper case characters is referred to as even small capitals and is a form often used in headings, and in page headers and footers. Note that this font also incorporates a form of the exclamation mark that is in proportion with the small capitals.

The legibility of small caps demands that what is termed the line’s tracking, in other words, the space between characters, is increased: hence the greater width of the second line.

Kerning

Even where the tracking in a given font may have been varied, some upper-case characters call for further adjustments. This form of adjustment is called kerning.

Top line: no kerning

You can see here that the spacing between the first three characters in the top line looks out of proportion with the rest.

Kerning applied

Here is the second of the two lines with each character enclosed within a bounding box. Kerning is a special form of adjustment that applies between specific pairs of characters, upper-case characters in particular. As you can see, when kerning is applied the bounding boxes of the a and the v overlap noticeably, making the string as a whole appear more evenly spaced.

Variable kerning

In some cases, several characters in a row will require kerning, though to different degrees according to the specific combinations that may arise.

Digraphs and ligatures

In some scripts, a single printed character can be composed of two components. This feature is termed a digraph: here is an example of a character that represents a discrete diphthong in Dutch and belongs to the Latin Extended-A block in Unicode.

IJ in Dutch: upper and lower case

In other words, this character is different from i and j printed side by side, which is why it has its own Unicode code point.

IJ: a composed character in Unicode

Another example is the character Dž and its lower-case equivalent dž, which are used in Croatian script. A further variant is the double capital form.

Latin capital DZ with caron

All three of these characters form part of the Latin Extended-B block.

Latin Extended-B in Unicode

See if you can identify them also in the Noto Serif character set.
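You can also confirm for yourself that these digraphs are single characters with their own code points. The following sketch uses Python's standard `unicodedata` module; the function name `separate` is an illustrative assumption. It relies on NFKC normalization, which replaces each digraph with the sequence of separate letters that Unicode records as its compatibility equivalent.

```python
import unicodedata

# The Dutch IJ (upper and lower case) and the three forms of DZ with caron.
digraphs = ["\u0132", "\u0133", "\u01c4", "\u01c5", "\u01c6"]

def separate(char: str) -> str:
    """Replace a digraph with its separate component letters (NFKC)."""
    return unicodedata.normalize("NFKC", char)

for char in digraphs:
    print(f"{char}  U+{ord(char):04X}  {unicodedata.name(char)}  ->  {separate(char)!r}")

# The single character and the two-letter sequence are different strings.
print("\u0132" == "IJ")          # False
print(len("\u0132"), len("IJ"))  # 1 2
```

One practical consequence is that a search for the two letters i and j will not necessarily find the single digraph character, unless the text is normalized first.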

A ligature is another form of joined character, but with a different purpose.

Ligatures

Today, what are termed standard ligatures are joined characters that avert otherwise awkward clashes, e.g. when an i or l follows an f. It is a good idea to activate these ligatures if you notice that clashes do arise in the font that you may be using.

Noto Serif: with and without ligatures

In the case of Noto Serif, f and l are liable to clash, as you can see in the top line.

You can, however, opt for fonts which are designed in such a way as not to require ligatures, like Liberation Serif, which is the default open-source font in LibreOffice.

Liberation Serif

OpenType fonts also often include historical or discretionary ligatures, which are mainly ornamental or antiquarian. These should be avoided in your essays: for the reader of today, they are something of a distraction. On the other hand, if you are reproducing the typography of an original source, you may wish to make use of them.

Shakespeare, Sonnet 130: historical st ligature

Numerals

OpenType fonts typically include numerals in different styles:

  • old-style (or non-lining) proportional numerals
  • old-style tabular numerals
  • lining proportional numerals
  • lining tabular numerals
Varieties of numerals

Old-style numerals are appropriate to use in body text: they form an even line with alphabetical characters, so improving the legibility of your content. Tabular forms, whether lining or non-lining, are most useful in tables: columns of numbers then remain aligned from row to row. Lining numerals are as a rule slightly less than capital height. Where figures are to be combined with upper case characters, lining numerals are a better choice.

Accessing font features in word-processors

To access font controls in Word, select Format > Fonts, or use the keyboard shortcut CTRL+D (Windows) or CMD+D (macOS).

Fonts in Word

The basic controls are available in this interface. To access more specialized features, select Advanced.

Advanced font options in Word

Here, you can control kerning, where necessary. You can also access OpenType font features, including ligatures.

Numerals in Word

Here, proportional and non-lining or old-style numerals are selected.

The best option to control font features in LibreOffice is to use the Typography Toolbar.

The Typography Toolbar in LibreOffice

Highlight a segment of text, or press CTRL+A or CMD+A to select all of the text in a document, and select the feature that you wish to apply.

6.2 - Installing fonts

How to select and install fonts for use in different kinds of work

Unicode fonts

The most practical resource to access scripts encoded in Unicode is the Noto Project. You can browse which fonts you would like to install. Noto Serif includes all of the various Latin blocks in Unicode and much more besides, as you can see from the full character set.

Charis SIL is also a useful font to install. Here is the full character set.

For access to a very wide range of symbols in Unicode, you can turn to the font Symbola.

Historical fonts

It is useful also to install fonts that include historical scripts that may no longer be in use, like Ogham, which is just one of a large number of scripts included in Clara.

Cardo, which describes itself as “a large Unicode font specifically designed for the needs of classicists, Biblical scholars, medievalists, and linguists”, is also a useful resource.

What a font encompasses

Typically, you will download a font in the form of a zip file. Once you have uncompressed it, you will find the separate files for different typefaces: typically Regular (or Roman), Italic, Bold and Bold Italic.

Other fonts may contain a much larger range of weights.

Font weights in Noto Sans Simplified Chinese

You can choose to install all of these or simply the ones that you are most likely to find useful.
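Before installing anything, it can be helpful to see which typeface files a downloaded archive actually contains. Here is a minimal sketch using Python's standard `zipfile` module. The function names and the archive name in the usage note are hypothetical, and the sketch assumes that font files carry the common `.ttf` or `.otf` extensions.

```python
import zipfile
from pathlib import PurePosixPath

FONT_SUFFIXES = {".ttf", ".otf"}  # assumed font file extensions

def list_font_files(archive_path) -> list[str]:
    """Return the names of the font files inside a zip archive, sorted."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() in FONT_SUFFIXES
        )

def extract_fonts(archive_path, target_dir: str) -> list[str]:
    """Extract only the font files and return the paths that were written."""
    names = list_font_files(archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        return [archive.extract(name, target_dir) for name in names]
```

For example, `list_font_files("NotoSerif.zip")` would list the separate files for each style and weight, and `extract_fonts("NotoSerif.zip", "fonts")` would uncompress them into a folder named `fonts`, ready for the installation steps for your operating system.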

Installing fonts in Windows

  1. Unzip the compressed file containing the fonts.
  2. Right click on an individual font file and select Install.
  3. Repeat for further styles and weights.

Installing fonts in macOS

  1. Unzip the compressed file containing the fonts.
  2. Double click on an individual font file.
  3. When Font Book opens, select Install Font. As a rule, all of the styles and weights of the same font will be installed.

6.3 - Fonts and Unicode

How typefaces allow you to access the resources of Unicode

Fonts and scripts

Unicode has led to the development of a range of fonts that encompass not only the major scripts in use in the world today, but also historical and iconographic materials.

CJK fonts

Thus, the Noto CJK Simplified Chinese font provides support for the following major contemporary scripts:

Cyrillic, Han, Hangul, Hiragana, Katakana, Latin, Simplified Han, Traditional Han

This means in turn that it can be used to display content in the following languages:

Afrikaans, Albanian, Asu, Basque, Bemba, Bena, Bulgarian, Cantonese, Catalan, Chiga, Chinese, Chinese (Simplified), Cornish, Danish, Embu, English, Faroese, Filipino, Friulian, Galician, German, Gusii, Icelandic, Indonesian, Irish, Italian, Japanese, Kabuverdianu, Kalaallisut, Kalenjin, Kamba, Kikuyu, Kinyarwanda, Korean, Low German, Luo, Luxembourgish, Luyia, Machame, Makhuwa-Meetto, Makonde, Malagasy, Malay, Manx, Meru, Morisyen, North Ndebele, Norwegian Bokmål, Norwegian Nynorsk, Nyankole, Oromo, Portuguese, Romansh, Rombo, Rundi, Russian, Rwa, Samburu, Sango, Sangu, Scottish Gaelic, Sena, Shambala, Shona, Soga, Somali, Spanish, Swahili, Swedish, Swiss German, Taita, Teso, Vietnamese, Vunjo, Zulu

The CJK fonts were developed jointly with Adobe, and the Source Han typefaces are the equivalent of the Noto versions. Like that of European serif fonts, the design of Source Han is rooted in calligraphic traditions.

Introducing Source Han Serif, Adobe's open source Pan-CJK typeface
 

European scripts, ancient and modern

Clara, by contrast, encompasses a range of European scripts, ancient and modern, including Latin and Cyrillic, as well as the medieval Irish script known as Ogham. It can therefore support the following languages:

Afrikaans, Akan, Albanian, Asturian, Asu, Basque, Belarusian, Bemba, Bena, Bosnian, Breton, Bulgarian, Catalan, Central Atlas Tamazight, Chiga, Colognian, Cornish, Croatian, Czech, Danish, Duala, Dutch, Embu, English, Estonian, Ewe, Faroese, Filipino, Finnish, French, Friulian, Fulah, Galician, Ganda, German, Gusii, Hausa, Hawaiian, Hungarian, Icelandic, Inari Sami, Indonesian, Irish, Italian, Jola-Fonyi, Kabuverdianu, Kalaallisut, Kalenjin, Kamba, Kikuyu, Kinyarwanda, Lithuanian, Low German, Lower Sorbian, Luba-Katanga, Luo, Luxembourgish, Luyia, Macedonian, Machame, Makhuwa-Meetto, Makonde, Malagasy, Malay, Maltese, Manx, Meru, Morisyen, Nama, North Ndebele, Northern Sami, Norwegian Bokmål, Norwegian Nynorsk, Nuer, Nyankole, Oromo, Polish, Portuguese, Quechua, Romanian, Romansh, Rombo, Rundi, Russian, Rwa, Samburu, Sango, Sangu, Scottish Gaelic, Sena, Serbian, Shambala, Shona, Slovak, Slovenian, Soga, Somali, Spanish, Swahili, Swedish, Swiss German, Taita, Teso, Tongan, Turkish, Turkmen, Ukrainian, Upper Sorbian, Uzbek, Vunjo, Walser, Western Frisian, Wolof, Zulu

The scope of the font is considerably wider than many of the system fonts installed by default on computers. This makes it a useful choice for your work.

Phonetic transcription

The Unicode blocks that cover the phonetic alphabet of the International Phonetic Association are fully supported in the Noto Sans and Noto Serif fonts, in Charis SIL and in Clara.

An IPA transcription

You can use a dedicated keyboard where extended inputting in a specialized block is required.

Unicode block by block

Using LibreOffice, you can readily explore Unicode blocks contained in a given font. Consider first the case of Noto Serif. To view the characters encoded in a font, select the omega icon and then select More Characters.

Access to Unicode blocks in LibreOffice

You can then access individual blocks using a drop-down menu.

Unicode blocks in Noto Serif

You can then see the characters that are available in a specific block.

IPA Extensions in Noto Serif

In LibreOffice, you can input characters using this interface, which is a convenient option in isolated cases.

Symbols

Unicode extends also to typographical and many other symbols.

Symbola: musical symbols in Unicode

The font Symbola allows access to multilingual blocks and also to a very wide range of Unicode symbols.

Combining scripts

Dedicated Latin scripts were developed for use with Source Serif and Noto CJK fonts. This means that content in European and Asian languages can easily be combined using the same font.

Noto Serif and Sans Serif CJK: Latin and Simplified Chinese

Though Latin and CJK fonts use different baselines and other reference points, they are aligned in such a way as to ensure content in different scripts is integrated.

Latin and CJK scripts in LibreOffice

LibreOffice, among other word-processors, now has good support for the display and printing of characters in markedly different scripts.

Similarly, the use of a font like Noto Serif, Charis SIL or Clara makes it possible to combine Latin and IPA scripts, as we can see in the example above.

7 - Experiment with Markdown

Use Markdown as an alternative to a word-processor and discover markup

7.1 - What is Markdown?

Markdown is a lightweight and flexible form of plain-text markup

Markup is a means of denoting the structure of a document. Any markup document is made up of “content” and of “markup”, the latter consisting of simple tags and other identifiers that designate component parts of the content, for example, a heading, or a list.

Markup in web pages

A commonplace example is the use of markup to denote the structure of webpages.

<!DOCTYPE html>
<html>
<head>
<title>This is an HTML document</title>
</head>
<body>
<h1>This is a heading</h1>
<p>This is a paragraph.</p>
</body>
</html>

Today, the form of markup used for writing structured documents for the web is Hypertext Markup Language and the current standard is HTML5.

This markup language is made up of a series of tags. An HTML document is made up by default of two main components, a <head>, which contains information about the document, including its <title>, and a <body>, which is made up of the content, for instance, headings and paragraphs, as shown above.

HTML5 also contains tags for all of the other expected elements of structured documents, such as ordered and unordered lists, inline and block quotations, tables, and links. It also contains tags like <img> and <video> to display audio-visual material, as well as markup for inline components, such as <em> or <strong>.

Note also that in HTML an opening tag is as a rule matched by a closing tag:

this word is <em>emphasized</em>
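To see how opening and closing tags nest inside one another, you can pass the sample HTML document above through Python's standard `html.parser` module. This is only a sketch: the class name `TagLogger` is an illustrative assumption, and it presumes that every opening tag is matched by a closing tag, as in the sample (elements such as <img>, which have no closing tag, would need separate handling).

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record each opening and closing tag, indented by nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.log = []

    def handle_starttag(self, tag, attrs):
        self.log.append("  " * self.depth + f"<{tag}>")
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1
        self.log.append("  " * self.depth + f"</{tag}>")

document = """<!DOCTYPE html>
<html>
<head>
<title>This is an HTML document</title>
</head>
<body>
<h1>This is a heading</h1>
<p>This is a paragraph.</p>
</body>
</html>"""

parser = TagLogger()
parser.feed(document)
parser.close()
print("\n".join(parser.log))
```

The indentation in the output makes the two main components visible: the <title> sits inside the <head>, and the heading and paragraph sit inside the <body>.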

Markup and Markdown

Markdown is a format in which text is encoded with minimal markup: hence the name.

Markdown is designed for writing structured documents and makes use of a distinct but simple set of formatting conventions.

Markdown files are written in UTF-8.

Simple structured documents in Markdown

As a first step in generating structured documents, you can experiment with Markdown, ‘a plain text format for writing structured documents’, as it is defined on the CommonMark site.

Content in Markdown rendered for display

Next, take the ten-minute CommonMark tutorial.

And then experiment in writing Markdown on the pandoc site, where you can experiment further by transforming one kind of markup into another.

Markdown in practice

So, you use minimal code to designate a heading and other elements:

# This is a level-one heading
This is a paragraph with a *link* to [CommonMark](https://commonmark.org/)

A paragraph, by contrast with a heading, doesn’t bear any specific code in Markdown, but is simply delimited by being followed by a blank line.

Single asterisks denote emphasis, or italics; double asterisks denote strong emphasis, or bold type.

Here is an unordered list:

- markup
- markdown

And here is a block quotation:

> En 1815, M. Charles-François-Bienvenu Myriel était évêque de Digne. C'était un vieillard d'environ soixante-quinze ans; il occupait le siège de Digne depuis 1806.

Advantages of Markdown

This practice is different from how a word-processor is normally used, where one typically applies a format to a given element of a document, e.g. by applying bold to a heading for display, rather than designating its structural function.

The advantage of a structured document, where, say, a heading or a list is designated as such, is that information about the structure can be re-used, just as much as the content: a single Markdown file can give rise to a PDF for printing, or to a webpage via an application like Hugo, with the structure being displayed as appropriate in each case.
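To illustrate how structural information can be re-used, here is a toy converter that turns a very small subset of Markdown into HTML. It is a sketch written for this page, not the method used by Hugo or pandoc, and it does not follow the CommonMark specification: it handles only headings, paragraphs, unordered lists, block quotations, links and emphasis. The function names `inline` and `to_html` are illustrative assumptions.

```python
import html
import re

def inline(text: str) -> str:
    """Convert links, strong emphasis and emphasis within one block."""
    text = html.escape(text, quote=False)
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r'<a href="\2">\1</a>', text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text

def to_html(markdown: str) -> str:
    """Convert headings, paragraphs, lists and block quotations to HTML."""
    # Put blank lines around headings so that each one forms its own block.
    markdown = re.sub(r"(?m)^(#{1,6} .+)$", r"\n\1\n", markdown)
    output = []
    for block in re.split(r"\n\s*\n", markdown.strip()):
        if not block:
            continue
        lines = block.splitlines()
        heading = re.fullmatch(r"(#{1,6}) (.+)", block)
        if heading:
            level = len(heading.group(1))
            output.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        elif all(line.startswith("- ") for line in lines):
            items = "".join(f"<li>{inline(line[2:])}</li>" for line in lines)
            output.append(f"<ul>{items}</ul>")
        elif all(line.startswith(">") for line in lines):
            quoted = " ".join(line[1:].strip() for line in lines)
            output.append(f"<blockquote><p>{inline(quoted)}</p></blockquote>")
        else:
            output.append(f"<p>{inline(' '.join(lines))}</p>")
    return "\n".join(output)

sample = """# This is a level-one heading
This is a paragraph with a *link* to [CommonMark](https://commonmark.org/)

- markup
- markdown

> En 1815, M. Charles-François-Bienvenu Myriel était évêque de Digne."""

print(to_html(sample))
```

Because each element is designated by its function, the same input could just as well be passed to a different converter to produce another format. For real work, use an established tool rather than a sketch like this one.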

This site is made up of Markdown files which are converted into webpages using Hugo.

Developing web content in Markdown

By definition, then, Markdown is designed for writing structured documents and, like other markup languages, makes use of a distinct set of formatting conventions. Because information on structure is encoded, like content, in plain text, content is not subordinated to presentation, and Markdown files are also small in size.

Markdown files can also incorporate metadata, again in plain text.

File metadata in Markdown

The four elements of metadata included here are the document’s title, its author, a set of tags to denote its content, and an abstract, or summary.
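Because metadata of this kind is plain text, it can be read by a program as easily as by a person. The sketch below assumes one common layout, in which the metadata appears at the top of the file as `key: value` lines between two lines of three hyphens; the sample file and the function name `split_front_matter` are invented for illustration. It is not a full parser of the kind that a tool like Hugo relies on.

```python
def split_front_matter(text: str) -> tuple[dict, str]:
    """Separate 'key: value' metadata between '---' lines from the content."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return metadata, body
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat everything as content

# Invented sample file with the four elements of metadata described above.
sample = """---
title: Notes on markup
author: A. Student
tags: markup, markdown
abstract: A short summary of the document.
---

# Notes on markup

The content starts here."""

metadata, body = split_front_matter(sample)
metadata["tags"] = [tag.strip() for tag in metadata["tags"].split(",")]
print(metadata)
print(body)
```

Once separated in this way, the title, author, tags and abstract can be re-used independently of the content, for instance to build an index of documents.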

A standard for Markdown

A variety of forms of Markdown exist, with work being undertaken at the moment to develop a common standard, namely CommonMark.

Markdown, like many of the tools you might use with it, is open source (subject to a Creative Commons license).

Why markup matters

Markup matters because it is widely used to store structured data, including a wide range of data used in language databases and in corpora.

EU legislation: a structured document in Croatian

In EU translation databases, texts like legal documents are encoded using XML markup, consisting of specialized tags, e.g. <p> for paragraph, or <head> for a heading that contains information about a document.
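Here is a short sketch of how markup of this kind allows a program to pick out the parts of a document. The XML fragment is invented for illustration and is not taken from an EU database: the enclosing `text` element and the `lang` attribute are assumptions, while `<head>` and `<p>` follow the tags mentioned above. The code uses `xml.etree.ElementTree` from Python's standard library.

```python
import xml.etree.ElementTree as ET

# Invented fragment for illustration only (hypothetical element names).
fragment = """<text lang="hr">
  <head>Članak 1.</head>
  <p>Prvi odlomak.</p>
  <p>Drugi odlomak.</p>
</text>"""

root = ET.fromstring(fragment)
heading = root.findtext("head")                   # content of the <head> tag
paragraphs = [p.text for p in root.findall("p")]  # content of each <p> tag

print(root.get("lang"), heading)
for number, paragraph in enumerate(paragraphs, start=1):
    print(number, paragraph)
```

Because each element is labelled, the heading and the individual paragraphs can be retrieved separately, which is what makes it possible to align or search large collections of texts.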

Collections of electronic texts are a further example of the use of markup. A local example is CELT, which is a collection of materials relating to Irish literary and historical culture in the languages represented on the island, including Irish, Latin, Anglo-Norman French and English. CELT makes use of specialized markup designed for use with just these kinds of sources.

As you work with specialized collections like these in the course of your studies, you will learn more about the different kinds of markup that allow large collections of linguistic data to be used in flexible ways. Experimenting with Markdown is a way of beginning to understand other forms of markup.

7.2 - Using Markdown

How to use Markdown in your work

From Markdown to PDF

An open-source editor like Visual Studio Code allows you to write and to edit documents in Markdown, and to preview the content as it would appear in print or on screen.

A Markdown file in Visual Studio Code with preview

Other tools, like Marked2, also allow you to preview a file and to export it for printing as a PDF.

You can also apply stylesheets to determine the formatting of Markdown elements (e.g. headings, lists) in PDF files.

Further tools, like Markdown Plus, allow you to combine several of these functions (there are versions for Windows and Mac).

Markdown as a common denominator

There are other tools that use Markdown as a common standard from which files in other formats (including word-processor formats, like Word) can be generated, notably pandoc.
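As a sketch of how one Markdown file can give rise to several formats, the following Python code builds pandoc commands of the form `pandoc INPUT -o OUTPUT`, where pandoc works out the output format from the extension of the output file. The file name `essay.md` and the function names are hypothetical, and the `convert` function only runs if pandoc is installed on your computer.

```python
import shutil
import subprocess
from pathlib import Path

def pandoc_command(source: str, extension: str) -> list[str]:
    """Build a 'pandoc INPUT -o OUTPUT' command for the given extension."""
    output = Path(source).with_suffix("." + extension)
    return ["pandoc", source, "-o", str(output)]

def convert(source: str, extension: str) -> bool:
    """Run pandoc if it is installed; return True when it succeeds."""
    if shutil.which("pandoc") is None:
        return False  # pandoc was not found on this computer
    return subprocess.run(pandoc_command(source, extension)).returncode == 0

# "essay.md" is a hypothetical file name; the commands are only printed here.
for extension in ["html", "docx", "odt"]:
    print(" ".join(pandoc_command("essay.md", extension)))
```

The same commands can be typed directly in a terminal. Generating a PDF in this way normally requires additional software to be installed alongside pandoc, so it is left out of this sketch.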

Markdown can also be used in conjunction with reference and research management tools, like Zotero, many of which are open source.

8 - Applications

A check-list of applications for work in languages

Word-processors

Notes

Bibliographical tools

Text editors

Markdown

Touch typing

Snippets

PDFs

Unicode

Phonetic transcription

Corpora tools

Image editors

Colour

Screenshots

Media players

Browsers and search engines

Utilities

9 - Useful reading

Where to go to find out more

Argument

Anthony Weston, A Rulebook for Arguments. Hackett, 1992.

Portable Document Format

Library of Congress, PDF is Here to Stay: Archiving with the Portable Document Format.

blogs.loc.gov/thesignal/2020/03/pdf-is-here-to-stay

Research

Thomas Mann, Oxford Guide to Library Research. Oxford University Press, 2005.

Scripts

British Library, A History of Writing.

www.bl.uk/history-of-writing

Richard Ishida, An Introduction to Writing Systems and Unicode.

r12a.github.io/scripts/tutorial/part2

ScriptSource.

scriptsource.org

Style guides

MHRA Style Guide.

MLA Style.

style.mla.org

Typography

Adobe, Glossary of Typographic Terms.

www.adobe.com/ie/products/type/adobe-type-references-tips/glossary.html

Matthew Butterick, Practical Typography.

practicaltypography.com

Paul Luna, Typography: A Very Short Introduction. Oxford University Press, 2018.

Microsoft, Microsoft Typography Documentation.

docs.microsoft.com/en-us/typography

Unicode

Unicode Consortium, What is Unicode?

www.unicode.org/standard/WhatIsUnicode.html

Unicode Consortium, Glossary of Unicode Terms.

www.unicode.org/glossary

10 - Checklist of resources

Here are the main resources mentioned in these pages

Word-processors

Microsoft Word support

LibreOffice help

Fonts

SIL Fonts

Google Noto Fonts

Text repositories

Project Gutenberg

CELT

Oxford Text Archive

Subject guides

Boole Library Subject Guides

Research resources

Boole Library: Databases A–Z

Open access

Boole Library Open Access

Directory of Open Access Journals

IMLR: Open resources for modern languages

Research resources and tools in languages

Clarin: resources

European Union: Language technologies

Sketch Engine

Voyant Tools

Markdown

CommonMark