Search Datasets
- dataset, CLARIN-LT, 2024
The English-Lithuanian parallel corpus DVITAS v2 includes original English texts on cybersecurity and their Lithuanian translations, aligned at the sentence level. Version 1 of the corpus was compiled for the bilingual terminology extraction project DVITAS together with an English-Lithuanian comparable corpus. The current second version expands the first with 27 additional files and metadata. The parallel corpus includes EU legal acts and other documents from the period 2006-2022. The documents were extracted from the EUR-Lex database and other EU institutional repositories. The dataset contains 107 aligned files in TMX format in English and Lithuanian, as well as 214 raw files (107 in English and 107 in Lithuanian). The total size of the corpus is 1.97m words (EN-1.08m; LT-0.88m). The corpus contains 53,792 aligned segments.
4
- dataset, OSF Registries, 2025
The key concepts, the university’s third mission and lifelong learning, have been widely studied since their introduction in EU policy documents (European Parliament, 2000). Research has accumulated to the point that extensive systematic literature reviews have been conducted on the university’s third mission (e.g., Compagnucci & Spigarelli, 2020; Haj Taieb, 2024) and on lifelong learning (e.g., Kaplan, 2016; Thwe & Kálmán, 2024; Håkansson Lindqvist et al., 2024). Lifelong learning has also been researched in connection with older adults’ needs (Baumgartner, Jin & Kim, 2023). However, little attention has been paid to the combination of all three concepts: the university’s third mission, lifelong learning, and older adults. Therefore, a scoping review was conducted using PRISMA for Scoping Reviews (PRISMA-ScR) (https://www.prisma-statement.org/scoping; Tricco et al., 2018) in combination with the methodology of Peters et al. (2022).
15
- dataset, OSF Registries, 2024
The key concepts - critical thinking, psychological well-being, and teacher victimization - are extensively studied within the school context. Significant research has accumulated, with extensive systematic literature reviews conducted on critical thinking (e.g., Yildirim-Tasti & Yildirim, 2022; Boonsathirakul & Kerdsomboon, 2023), psychological well-being (e.g., Puertas Molero et al., 2019; Hascher & Waber, 2021; Berger et al., 2022; Dreer, 2023; Katsarou et al., 2023; Cann et al., 2024; Hamid Mukhlis et al., 2024; Fu & Zhang, 2024; Ozturk, Wigelsworth, & Squires, 2024), and teacher victimization (e.g., Reddy et al., 2018; Chirico et al., 2021). Relevant scoping reviews and meta-analyses on these concepts also exist (e.g., Longobardi et al., 2019; Ma et al., 2022; Alves et al., 2022; Maricuoiu et al., 2023). While numerous systematic reviews and meta-analyses address teacher victimization, psychological well-being, and critical thinking individually, few studies explore the interplay among these three concepts. The scoping review aims to fill this knowledge gap.
8
- dataset, European Organization for Nuclear Research, 2024
The LLODIA (Linguistic Linked Open Data for Diachronic Analysis) model, developed within the Nexus Linguarum WG4 UC4.2.1 use case in the humanities.
18
- dataset, European Organization for Nuclear Research, 2023
This dataset contains ISBNs and the metadata described below for 38,050 books that UK universities and research institutions submitted as their best research outputs to the UK's Research Excellence Framework (REF) in 2014 and 2021. The initial data were downloaded from the REF webpages presenting the submissions and results for REF 2014 and REF 2021. A thorough explanation of how the initial datasets were transformed into this dataset is provided in a preprint: Dagiene, Eleonora. 2023. "The Challenge of Assessing Academic Books: The UK and Lithuanian Cases Through the ISBN Lens." SocArXiv. https://doi.org/10.31235/osf.io/qpwxn. The ISBN metadata were gathered from the Global Register of Publishers (https://grp.isbn-international.org/).
32
- dataset, Vytauto Didžiojo universitetas, 2022
The English-Lithuanian parallel corpus DVITAS includes original English texts on cybersecurity and their Lithuanian translations, aligned at the sentence level. The corpus was compiled for the bilingual terminology extraction project together with an English-Lithuanian comparable corpus. The parallel corpus includes EU legal acts and other documents from the period 2006-2021. The documents were extracted from the EUR-Lex database and other EU institutional repositories. The dataset contains 80 aligned files in TMX format in English and Lithuanian, as well as 160 raw files (80 in English and 80 in Lithuanian). The total size of the corpus is 1.4m words (EN-0.77m; LT-0.63m). The corpus contains 35,415 aligned segments.
74
- dataset, Vytauto Didžiojo universitetas, 2022
The English-Lithuanian comparable corpus (DVITAS COMPARABLE) is morphologically annotated. It includes original English and Lithuanian texts on cybersecurity from the period 2010-2021. The corpus was compiled for the bilingual terminology extraction project together with an English-Lithuanian parallel corpus. There are 1,708 files in English and 2,567 in Lithuanian. The total size of the corpus is 4m words (EN-2m; LT-2m). The corpus is composed of texts representing four text types: academic (EN-19%; LT-30%), administrative-informative (EN-8%; LT-11%), legal (EN-18%; LT-4%), and media (EN-55%; LT-55%).
53
- dataset, Center for Open Science, 2023
This is the registration for a scoping review with no prior registrations. The objective of this scoping review is to map existing evidence on the characteristics of child maltreatment pre-, during, and immediately post-COVID-19 in Euro-CAN COST Action countries. The main aims of this scoping review are: 1) to identify both the type and extent of available evidence and existing gaps, and 2) to ascertain the need for and feasibility of conducting a systematic review. The review follows the methodological framework of the Joanna Briggs Institute. It will be reported according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) extension for scoping reviews statement, and the results will be disseminated through conference presentations and publication in a peer-reviewed journal.
52
- dataset, The challenge of assessing academic books: The UK and Lithuanian cases through the ISBN lens, 2023-03-29, 1 csv.
This dataset contains ISBNs and the metadata described below for 5,199 books that Lithuanian universities and research institutions submitted as their best research outputs to obtain state funding after annual research evaluations from 2008 to 2020. A thorough explanation of how this dataset was created is provided in a preprint: Dagiene, Eleonora. 2023. "The Challenge of Assessing Academic Books: The UK and Lithuanian Cases Through the ISBN Lens." SocArXiv. https://doi.org/10.31235/osf.io/qpwxn.
52
- Lithuanian-English Cybersecurity Termbase v.0.1 [Lietuvių-anglų kalbų kibernetinio saugumo terminų bazė v.0.1], dataset, Vytauto Didžiojo universitetas, 2023
The bilingual termbase is a TBX export of the online termbase https://www.terminologue.org/csterms/. The termbase includes terms for 233 cybersecurity concepts.
40
- dataset, Vytauto Didžiojo universitetas, 2018
A 274,460-word corpus comprising selected primary and secondary EU legal acts from the period 2015-2017. The corpus was compiled from documents containing words with the root "teis-" (English: law). All included documents were extracted from the EUR-Lex database.
16
- TED-ELH Parallel Corpus, dataset, Mykolo Romerio universitetas, 2020
The corpus contains aligned transcripts of TED Talks in English, Lithuanian, and Hebrew. It contains spoken language data.
34