Search Datasets
- dataset: European Organization for Nuclear Research, 2024
The LLODIA (Linguistic Linked Open Data for Diachronic Analysis) model, developed within the Nexus Linguarum WG4 UC4.2.1 use case in the humanities.
- dataset: European Organization for Nuclear Research, 2023
This dataset contains ISBNs and the metadata described below for 38,050 books that UK universities and research institutions submitted as their best research outputs to the UK's Research Excellence Framework (REF) in 2014 and 2021. The initial data was downloaded from the REF webpages presenting the submissions and results for REF 2014 and REF 2021. A thorough explanation of how the initial datasets were transformed into this dataset is provided in a preprint: Dagiene, Eleonora. 2023. "The Challenge of Assessing Academic Books: The UK and Lithuanian Cases Through the ISBN Lens." SocArXiv. https://doi.org/10.31235/osf.io/qpwxn The metadata for these ISBNs was gathered from the Global Register of Publishers, https://grp.isbn-international.org/.
- dataset: Mykolo Romerio universitetas, 2024-10-04
The data is provided in two files: one containing the questionnaire data and the other containing the respondents' data. The questionnaire data is in a TXT file, which includes the survey questions and possible responses. The respondents' data is in a TSV file with 26 columns, detailing anonymised respondent information and their answers: Timestamp, Agreement, Region, Age, Education, Profession, Preference of the 1st term, Reasons, ..., Preference of 10th term, Reasons. A total of 593 respondents participated, representing diverse age groups, regions, and levels of expertise. Participants were asked to choose the most appropriate Lithuanian terms for 10 cybersecurity concepts (cyberattack, spam, denial-of-service attack, man-in-the-middle attack, brute force attack, phishing, botnet, hacker, honeypot method, zero-day vulnerability). They could either select a term provided in the questionnaire or suggest their own, giving reasons for their selections. The dataset facilitates research into terminology preferences, revealing which types of terms users prefer (borrowings, metaphorical calques, or descriptive terms) and how preferences vary across two pairs of respondent groups: students versus graduates, and cybersecurity experts versus the general public. Additionally, the data on respondents' reasoning reveals key factors in determining term suitability. The self-suggested terms underscore respondents' creative potential and their strong interest in maintaining national terminology. Rackevičienė, Sigita & Utka, Andrius (2024) "Preferences of Lithuanian cybersecurity synonymous terms in different user groups." Kalbų studijos 44, 107-122.
- dataset: Open Science Framework, 2024
This is a scoping review protocol that aims to fill a gap in knowledge synthesis about technostress among academics aged 50 and over. Reviews exist on technostress in general and on technostress among older workers, but little attention has been paid to this specific group of older workers. The scoping review therefore synthesizes the available literature on technostress among academics aged 50+. Three research questions are formulated: What kinds of technological tools (i.e., digital vs. analog) are associated with distress among academics? What types of health and educational challenges do academics experience as a result of their use of various technologies? How did the academics deal with the difficulties caused by technology?
- dataset: Vytauto Didžiojo universitetas, 2022
The English-Lithuanian parallel corpus DVITAS includes original English texts on cybersecurity and their Lithuanian translations, aligned at the sentence level. The corpus was compiled for a bilingual terminology extraction project, together with an English-Lithuanian comparable corpus. The parallel corpus includes EU legal acts and other documents from 2006-2021. The documents were extracted from the EUR-Lex database and other EU institutional repositories. The dataset contains 80 aligned files in TMX format (English and Lithuanian) and 160 raw files (80 in English and 80 in Lithuanian). The total size of the corpus is 1.4 million words (EN: 0.77 million; LT: 0.63 million). The corpus contains 35,415 aligned segments.
- dataset: Vytauto Didžiojo universitetas, 2018
A 274,460-word corpus composed of selected primary and secondary EU legal acts from 2015-2017. The corpus was compiled from documents containing words with the root "teis-" (English: law). All of the included documents were extracted from the EUR-Lex database.
- dataset: Mykolo Romerio universitetas, 2020
TED-ELH Parallel Corpus. The corpus contains parallel, aligned scripts of TED Talks in English, Lithuanian, and Hebrew. It contains spoken language data.