Europe PMC is the database of life sciences abstracts and full text articles that incorporates both PubMed and PMC content, as well as life sciences preprints. Europe PMC is free to use, is updated daily (currently holding ~45 million abstracts and ~10 million full text articles), and is supported by 35 funders of life sciences research.
In addition to providing powerful search and retrieval mechanisms for the content, we also collaborate with many organisations to integrate the articles with ORCIDs, supporting data, funding information and other resources that provide relevant information for our users. Finally, we engage with the text mining community to maximise exposure and reuse of their work to improve search and browse systems.
The role
We require a Data Analyst to join the Literature services team here at EMBL-EBI. In this role you will interact with the team (Team Leader, Developers, Product team, Community Manager and Content Specialists) in an Agile environment to help us ensure data quality (structured metadata of literature content and cross-linking of literature to other data using persistent identifiers, gathered from multiple sources) across all of our services.
Through analysis of those data we expect to recognise trends in scientific publishing, emerging standards, and potential new sources of data linking. This role will help us to remain responsive to innovations in publishing, both scientific and technical, to extend our collaboration network, and to improve services for our users.
We are therefore looking for a versatile data/research analyst to elicit stories from the data in Europe PMC and translate these insights into potential actions. As well as having the technical skills to manipulate large datasets, you will need a curiosity-driven approach and a desire to share the insights you discover, both with the technical team, in order to improve the user experience, and with colleagues more broadly in the field, in order to extend collaborative activities.
- Scope new sources of content and linked metadata, translating the opportunities into specifications for technical staff. This will involve working with new and varied data in both structured and unstructured forms
- Make recommendations on data management, quality improvements, or development opportunities to the technical team
- Provide analytics on incoming content and related linked scientific outputs regarding the uptake and implementation of new publishing trends and standards e.g. data and software citation, linked persistent identifiers such as ORCID, ROR, Accessions and DOIs
- Provide content-based analysis to inform decisions and for reporting and publishing purposes
- With the product team, analyse web analytics and gather insights on user behaviour on the website
- Develop dashboards to assist Europe PMC Funders and Europe PMC team leader to monitor Open Science trends through the integrated data sources available
- Work with EMBL’s Office for Scientific Information Management and Strategy teams to provide solutions for their data requirements
- Periodically write reports/papers on the latest activities of the group
- Outreach to aligned projects and resources in the scholarly communications ecosystem and represent Europe PMC at workshops and meetings, as appropriate
You have
- Undergraduate degree in a related field, or an equivalent qualification or professional experience
- Experience analysing the results of large datasets, including producing plots with Python or R
- Willingness to contribute to pipeline development in step with the standard procedures and timelines of the wider Europe PMC project
- Willingness to work as part of a collaborative team to identify issues, discuss solutions and refine details to reach our high-level requirements and goals
- Excellent communication and presentation skills
You might also have
- Good understanding of Open Science policies and what data science can achieve to support monitoring of adherence to new policies
- Ability to use collaborative software environments like Git
- Experience with Data Analysis using PostgreSQL and SQL
- Experience with linked data, RDF, MongoDB, graph databases (e.g. Neo4j) and triple store technologies
- Experience using open source data analytics tools, such as Matomo
- Experience working in an open source environment
Benefits
- Financial incentives: Monthly family, child and non-resident allowances, annual salary review, pension scheme, death benefit, long-term care, accident-at-work and unemployment insurances
- Flexible working arrangements, including spending at least 2 days in the office per week
- Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover)
- Generous time off: 30 days annual leave per year, in addition to eight bank holidays
- Relocation package including installation grant (if required)
- Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely)
- Family benefits: On-site nursery, 10 days of child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances
- Benefits for non-UK residents: Visa exemption, education grant for private schooling, financial support to travel back to your home country every second year and a monthly non-resident allowance.