BASE in General
What is BASE?
BASE (Bielefeld Academic Search Engine) is one of the world's most voluminous search engines, especially for academic web resources such as journal articles, preprints, digital collections, images / videos or research data. Try searching with BASE right now!
What is Different about BASE?
BASE facilitates effective and targeted searches and retrieves high quality, academically relevant results. Unlike search engines like Google or Bing, BASE searches the deep web as well. The content providers which are included in BASE are intellectually selected (by people from the BASE team) and reviewed. That's why data garbage and spam do not occur. Read more details about the project.
The People behind BASE
BASE is a project of Bielefeld University Library: The BASE Team.
BASE Future Developments
BASE is a strategic project of Bielefeld University and is in a state of constant development.
Will privacy be guaranteed in BASE?
All data that arise while using BASE are stored exclusively on servers at Bielefeld University and are not passed on to third parties. Statistical evaluation is carried out strictly anonymously. For further information see the legal notice.
Which indexing software is used for BASE?
Since May 2011 we have been using the open source search technology Solr/Lucene. Until May 2011 the search engine technology of Microsoft FAST (Fast Search And Transfer) was used.
Get in Touch With Us!
We highly appreciate your comments and feedback. Leave us a message or send a tweet to @BASEsearch.
Which criteria does a content provider / source have to meet to be added to the BASE index?
There are three criteria:
- The source contains academic content only
- At least some documents from the source are available as open access (full texts free of charge, without registration)
- The metadata of the documents are provided via a valid OAI-PMH interface (learn more about this in our Golden Rules for Repository Managers)
We also regularly examine repository directories such as OpenArchives, ROAR and OpenDOAR, lists of installations of suitable software such as DSpace or OJS, and the Crossref publishing platform, and we harvest and index the content of appropriate sources.
How can I recommend a new content provider / source for indexing?
Go to the page add content provider and check whether the content of your source is already indexed in BASE and, if it is not, whether it fits our criteria. After that you can fill out a form and suggest your source to us. We also regularly observe repository directories like OpenArchives, ROAR and OpenDOAR, repository software directories like DSpace or OJS, and the publishers' platform Crossref, and index the content of suitable sources.
Can I upload a document directly to BASE?
No, it's not possible to upload documents to BASE. BASE is not a subject database or a publication management system, but a search engine. To get your publication indexed by BASE you have to upload it to a content provider (a university repository or a general repository like Zenodo) or publish it in a journal that is indexed by BASE (see our list of indexed content providers).
How do I set up an OAI interface, so that my source can be indexed by BASE?
Repository software like DSpace, EPrints or OJS (for journals) provides an OAI interface by default. Sometimes it has to be activated or configured. Check out our golden rules for repository managers; they might help you optimize your OAI interface. You can also set up an OAI interface on your own. Look for the implementation guidelines on the Open Archives Initiative's website. You can find more general information about OAI at OpenArchives.org and Wikipedia. With our OAI validator OVAL you can easily verify whether your source is compliant with the BASE requirements.
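For a first manual check of an OAI interface, it can help to know what the requests look like. The following is a minimal sketch in Python that only builds the request URLs for two standard OAI-PMH verbs (Identify and ListRecords with the oai_dc metadata format). The endpoint address and the helper name oai_request_url are hypothetical and used for illustration only; the sketch does not replace a check with OVAL.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL from a base URL, a verb and its arguments."""
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# Hypothetical endpoint, for illustration only.
endpoint = "https://repository.example.org/oai"

# Identify returns general information about the repository.
print(oai_request_url(endpoint, "Identify"))

# ListRecords with metadataPrefix=oai_dc returns records as Dublin Core metadata.
print(oai_request_url(endpoint, "ListRecords", metadataPrefix="oai_dc"))
```

Opening such a URL in a browser should return an XML response if the interface is active.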
My source does not provide an OAI interface. Is it possible to index the documents in BASE nevertheless?
If your source does not provide an OAI interface and you are not able to set one up, upload your documents to aggregators like DataCite or Zenodo or to subject repositories like RePEc, or add your open access journal to DOAJ. We index these content providers regularly. However, the best way to get your documents indexed by BASE is to provide an OAI interface. In this case we can ensure fast and smooth indexing of your source, and data from your source will be presented completely and in the best possible way.
Does indexing involve any costs?
No, all services in BASE (including indexing of content providers) are provided free of charge.
Updating / Deleting Content
How often do you update the content of indexed content providers?
We update all indexed content providers twice a month. At longer intervals all content is completely re-harvested and re-indexed.
My journal is already indexed in BASE. Do I have to send you a message when a new issue is published?
No, the content of all indexed content providers is updated automatically and regularly. If an article published in a source that is already indexed in BASE has not been indexed after 6 weeks, there is usually an issue with the OAI interface of the content provider. In this case please send us a message with detailed information (at least the name and URL of the content provider).
The name of my source in the field "Content provider" is not correct. Can you correct this?
If there is a spelling mistake in the name of a content provider or the name of a content provider has changed, you are welcome to report this to us simply via our contact form and we will correct the name as soon as possible.
Please note: In the field "Content provider" you will always find the name of the provider from which we have indexed a document. For various reasons, this may differ from the actual name of your source (for example the name of your journal or repository). It is possible that the documents were indexed via an aggregator (for example DataCite) or via an e-journal platform. In this case, the name of the aggregator / platform is given and the name of the content provider cannot be changed. For journals, the name of the journal should therefore always be specified in the dc:source field of the OAI interface. This information will be shown in the field "Source" in a BASE hit list.
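To illustrate where the journal name belongs, here is a minimal sketch in Python that reads the dc:source element from a Dublin Core (oai_dc) record. The sample record, its values and the helper name journal_sources are hypothetical; only the two namespaces are the standard oai_dc and Dublin Core namespaces.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace, in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical oai_dc metadata of a journal article, for illustration only.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Article</dc:title>
  <dc:source>Journal of Examples; Vol. 1 No. 2</dc:source>
</oai_dc:dc>"""


def journal_sources(xml_text: str) -> list[str]:
    """Return the text of all non-empty dc:source elements in an oai_dc record."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(DC + "source") if el.text]


print(journal_sources(SAMPLE))
```

If this list is empty for your records, the journal name is not delivered in dc:source.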
Why is my source indexed incorrectly / incompletely?
It can take up to 6 weeks until a record / document that is added to the OAI interface of a content provider is indexed in BASE. If the content of your source is not updated for a longer period of time, this is generally related to an issue with the OAI interface of the content provider. It might also be the case that records are provided incorrectly or not at all via the OAI interface of the content provider. The documents may be provided correctly / completely via the web interface of the content provider, but as we index OAI metadata only, the web interface is not relevant for us. You can report errors via our contact form. If you are the manager of a content provider, please check the compliance using our OAI validator OVAL and mind our golden rules for repository managers.
Some documents are indexed incorrectly. Where can I report this issue?
If you stumble upon an error in a record in the BASE hit list, for example wrong or missing author names, titles, years or bugs in the character set (for example ? instead of a character), the content providers are usually responsible for this. BASE is a search engine. This means we index the documents / records the way they are provided by the content providers. We already correct obvious metadata errors with automated procedures during indexing, but a substantive check is impossible. If some documents are indexed incorrectly, they usually have to be corrected at the content provider. If the metadata is corrected in the OAI interface of the content provider (and the "datestamp" of the record is updated as well), the records should be updated in BASE, too, because we update all indexed content providers regularly. It can take up to 6 weeks until a record / document that is corrected in the OAI interface of a content provider is updated in BASE.
I published a document in a source which is indexed in BASE, but I can't find my document in BASE. Why?
It takes some time until new documents are indexed in BASE. If your document has not been indexed after 6 weeks, there is usually an issue with the OAI interface of the content provider. In this case please send us a message with detailed information (at least the name and URL of the content provider and the title and URL of the document).
I have made changes to documents in my source. Why are these changes not updated in BASE?
Any subsequent change to a record (document) must be marked in your OAI interface by updating the record's datestamp. All indexed content providers are regularly updated in BASE. If the "datestamp" is not updated, an update in the BASE index is not possible, and the document remains unchanged and therefore incorrect in the index until we re-index the content provider completely, which is only done at longer intervals.
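Why the datestamp matters can be shown with a small sketch in Python. It assumes that a harvester only asks for records changed since its last visit (OAI-PMH supports this kind of selective harvesting through the "from" argument, which is inclusive). The sample records and the helper name changed_since are hypothetical, and the sketch is not a description of the BASE harvester itself.

```python
from datetime import date

# Hypothetical records as they might appear in an OAI interface.
RECORDS = [
    {"identifier": "oai:repository.example.org:1", "datestamp": "2024-01-10"},
    {"identifier": "oai:repository.example.org:2", "datestamp": "2024-03-05"},
]


def changed_since(records, last_harvest):
    """Return identifiers of records whose datestamp is on or after last_harvest."""
    return [
        record["identifier"]
        for record in records
        if date.fromisoformat(record["datestamp"]) >= last_harvest
    ]


# Record 1 was edited after 2024-03-01, but its datestamp was left at
# 2024-01-10, so an update that starts from 2024-03-01 does not see it.
print(changed_since(RECORDS, date(2024, 3, 1)))
```

In this sketch only the second record is picked up; the first one stays outdated until a complete re-harvest.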
I would like to have a document removed from the index. Is this possible?
BASE is a search engine. We only index documents that are publicly provided by content providers. We cannot delete records directly from our index, because they would be indexed again the next time a source is indexed. If you think that a record should be deleted from BASE, you must contact the content provider and ask them to remove the document from their source. Please note that documents are also distributed via aggregators. This means a document may be referenced in different sources and is accordingly indexed in BASE multiple times. In order for the document to no longer appear in the BASE index at all, it must be removed from all content providers. Please also note that there are sometimes legal regulations that prevent documents from being arbitrarily deleted or censored.
I have deleted a document from my source. Why is it not deleted from BASE?
If a document is deleted from your source, the record in the OAI interface must be marked as "deleted". Under no circumstances may the record be completely deleted from the OAI interface. If a document is not marked as "deleted" (but is completely removed from the OAI interface instead) the document remains unchanged and therefore incorrect in the index until the content from the content provider is completely re-indexed, which is only done in larger intervals.
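In OAI-PMH a deleted record is signalled by the attribute status="deleted" in the record header. The following minimal sketch in Python checks a record for this marker. The sample record and the helper name is_marked_deleted are hypothetical; the namespace is the standard OAI-PMH 2.0 namespace.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace, in ElementTree's {uri}tag notation.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical record of a deleted document: the header stays in the
# OAI interface and carries status="deleted"; the metadata part is gone.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:repository.example.org:1234</identifier>
    <datestamp>2024-03-01</datestamp>
  </header>
</record>"""


def is_marked_deleted(record_xml: str) -> bool:
    """Return True if the record header carries status="deleted"."""
    record = ET.fromstring(record_xml)
    header = record.find(OAI + "header")
    return header is not None and header.get("status") == "deleted"


print(is_marked_deleted(SAMPLE))
```

A record that has simply vanished from the interface cannot be detected this way, which is why it remains in the index until the next complete re-indexing.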
The percentage of OA documents of a content provider is not correct. Why are open access documents from my source not marked as open access in BASE?
In our list of content providers we show the percentage of OA documents for every indexed content provider. There is a three-step process by which a document indexed in BASE can get the open access status:
- The content provider provides open access documents only. This will be recorded (if known) in our administration database and all indexed documents from this content provider are marked open access in BASE
- The metadata of an open access document contains a special set for open access documents in <record><header> in "setSpec" (for example "driver", "openaire" or "OpenAccess"). This set is entered into our database and all documents from this set are marked as open access.
- Open access documents are individually marked by the content provider in <dc:rights> (either CC license or descriptive information such as "OpenAccess" etc.)
If none of these three details are available, BASE cannot identify the document as open access. If there's an issue with this number, check your OAI interface and improve your metadata, if possible (see our Golden rules for details) or leave us a message.
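The three steps can be sketched roughly as follows in Python. This is an illustration under assumptions, not the BASE implementation: the function name looks_open_access, the case-insensitive comparison of set names and the simple text matching on the rights information are all hypothetical choices made for this example.

```python
# Set names mentioned in step 2; comparing them case-insensitively is an assumption.
OA_SET_NAMES = {"driver", "openaire", "openaccess"}


def looks_open_access(provider_is_oa_only, set_specs, rights):
    """Apply the three steps in order and return True at the first match."""
    # Step 1: the whole content provider is recorded as open access only.
    if provider_is_oa_only:
        return True
    # Step 2: the record belongs to a dedicated open access set (setSpec).
    if any(spec.lower() in OA_SET_NAMES for spec in set_specs):
        return True
    # Step 3: dc:rights carries a CC license or a descriptive statement.
    # The matching below is a deliberately simple, assumed heuristic.
    for value in rights:
        text = value.lower()
        if "creativecommons.org/licenses" in text:
            return True
        if "openaccess" in text.replace(" ", "").replace("-", ""):
            return True
    return False


print(looks_open_access(False, ["openaire"], []))
print(looks_open_access(False, [], ["https://creativecommons.org/licenses/by/4.0/"]))
print(looks_open_access(False, ["articles"], ["All rights reserved"]))
```

The last call shows the case described above: without any of the three details, the document cannot be identified as open access.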
My source / document has apparently been removed from BASE. Why is that?
We check the servers of all indexed content providers regularly. When a server / OAI interface does not work at all or works very unreliably over a longer period of time, this server is temporarily or permanently removed from the index. This is also done if a content provider no longer offers open access content. It can also happen that individual documents are no longer made available via the OAI interface of a content provider. These documents are then also removed from BASE when the content is updated. In this case, check whether the OAI interface of the content provider works correctly and whether the document is still contained in it. If this is the case, you can send us a message using the contact form and ask us to check the content provider again.
A source / publisher seems to be "predatory" / "fake science" or seems to offer advertising / spam. Can this source be deleted?
We are aware that predatory publishing is a huge problem, and we try to check whether a source (for example a journal) is predatory before deciding whether to index it. This is very difficult, however, as we can only perform a short check (often just a technical check of the content provider in general) and cannot check the content of, for example, a journal in detail. BASE only indexes sources which provide a working OAI interface, and this is often a technical barrier for these kinds of journals. However, some articles from these journals are indexed via general ("open for all") content providers or aggregators. Only the content provider / aggregator can delete this content from its source.
Unfortunately, it can also happen that a content provider (for example a journal platform) discontinues its service and sells the domain. Then it can happen that the new operator offers content that has nothing to do with the content we originally indexed.
As soon as we notice these kinds of problems, the content provider is deleted immediately from our index. You are welcome to help us by reporting such content providers via our contact form.
Searching / Result List
How do I search BASE?
See our search help.
Do you offer a full text search in the indexed documents?
Due to time and performance constraints we are indexing only metadata (title, author names, abstract …) of documents. Thus it's not possible to search the full text of the indexed documents.
Can I narrow a search to open access content only?
Use the advanced search and the criterion "Access" to narrow the search to open access documents or to exclude documents that are not freely accessible. You can also narrow directly to CC (Creative Commons) licenses. CC licenses allow content to be distributed; there are several sub-licenses, depending on whether commercial redistribution is allowed or whether redistribution must also be under a CC license. For details see for example Wikipedia.
After performing a search you can narrow your search to open access documents only. To do so, click on "Access" in the "Refine Search Result" box and on "Open Access". The result list will be narrowed to documents which are clearly marked as open access documents by the data provider in their metadata. Keep in mind that we can only identify 35-40% of all indexed documents as open access (a total of circa 60% of the indexed documents are freely accessible, which means that 20-25% of the documents are open access but could not be marked as such because of a lack of information).
Why are open access documents in BASE not marked as open access?
There is a three-step process of how a document indexed in BASE gets an open access status and is marked with an icon (for details, see above). If none of these three details are available, BASE cannot identify the document as open access.
Why can't I access the full text of a document?
About 60% of the indexed documents in BASE are open access. The rest are mere metadata entries without full text or can only be accessed if you are authorized to access the particular website. You can search the metadata of all indexed documents. The authorization is always done by the content provider. If you don't have access to a full text although your institution is supposedly authorized, please contact your IT department or the content provider.
Why can't I access a document marked as "Open Access"?
If a document is marked as "Open Access" but still not accessible, there is usually an error on the part of the content provider. We mark documents as "Open Access" according to the information received from the content provider (for example corresponding details in the "rights" information or a CC license). We cannot check individually whether this information is correct. If you come across a document that is not correctly labelled as "Open Access", please contact the content provider and point out the problem.
Why do I always end up with an error message when I try to access a document?
If you get an error ("page not found"), the web address (URL) of the document might have changed or the document might have been deleted since we last indexed it. Though content from academic sources should have permanent addresses, and changes or deletions of documents should be communicated via the OAI interface, in practice this is often not the case. Therefore it might happen that links to documents which appear in our result list do not work. Another reason for an error might be that the server of the content provider is temporarily or permanently unavailable. If you encounter an error, please leave us a message. We will contact the content provider or remove it from our index if it's a permanent problem.
What are metadata?
Especially in an academic environment you will often come across documents containing metadata. These are descriptive elements assigned to a document in order to specify it both technically and in terms of content. Metadata are for example authors' names, publication dates, abstracts, language or, in the case of a journal, details regarding the title or the issue. Most of the documents which we index contain metadata; therefore you can search specifically for authors or publication years in the advanced search or narrow search results. "Normal" websites generally do not have metadata, and this is why a precise search for authors or publication years is not possible or only possible to a very limited extent in internet search engines such as Google or Bing.
The metadata of a record is broken. Can you correct it?
If you find an error in the metadata, for example wrong or missing author names, titles, years or bugs in the character set (for example ? instead of a character), the content providers are responsible for this. We already correct obvious metadata errors with automated procedures during indexing, but a substantive check is impossible. It's best to contact the content provider directly if you find errors in a record.
If the record appears correctly at the content provider, it may be that the metadata is delivered incorrectly via the OAI interface, which we use to index the records. It may also be that the content provider has recently corrected the record. In this case, the record in the BASE results list is also corrected during the next update (usually within 2-3 weeks).
What does the link "claim" after the author name mean (Add ORCID ID)?
See information about this topic in the next section.
What is ORCID?
ORCID (Open Researcher and Contributor ID) is a standard that is established worldwide for unambiguously matching academic authors to their respective publications. The ORCID iD as persistent identifier allows, for example in case of identical names, name changes or name variants, the clear differentiation of author names. It can be used anywhere in the world and depicts affiliation changes throughout individual academic careers. Thus, it contributes to better visibility of authors and their publications, and more and more publishers and funding organisations are demanding that authors provide an ORCID iD.
How can I "claim" publications in BASE?
If you are the author of a publication that is listed in a BASE results list, you can confirm the authorship for this publication via the link "claim". After "claiming", an icon ("iD") with a link to your profile at ORCID is displayed next to your author name in BASE.
For this you must register once with our search engine (see personal login) and you must have an ORCID iD. See our Quick Guide Claiming of Publications with ORCID iD (text in German only, but the screenshots will show you how to do it).
If you do not have an ORCID iD yet, you can register for free at ORCID after clicking on "claim". After registering, ORCID will provide you with an iD that allows you to uniquely identify your publications (even if your name is identical to another person's, is spelled in different variants or has changed). You can also incorporate metadata of the publication (author, title, year of publication, etc.) directly into your publication list at ORCID.
I want to undo the "Claiming". How can I do this?
If you have assigned publications of another person to your ORCID iD or if you want to undo the "Claiming" for another reason, you can delete the assignment. Log in to your personal account in BASE and click on "My Publications". Here you can find all publications you have claimed in BASE. Click on the link "Delete from your publication list in BASE" or "Delete from your publication list in BASE and ORCID" to disconnect the assignment. The icon "iD" next to the author's name in the publication you have claimed accidentally will be deleted and the link "claim" will be shown again.
My publications have been assigned to the wrong person. Can this be corrected?
If someone else has incorrectly "claimed" your publications as his or her own, please let us know about this issue via our contact form. Click on "Detail View" to get the URL to this record and copy this URL in the contact form, so we can determine the problem. If several or all of your publications have been claimed incorrectly please write a comment in the contact form and give us an example.
Can I "claim" publications for several authors from my institution, publishing house, journal via one account?
The link between BASE and ORCID is always based on the personal ORCID account of the respective author. Therefore, you cannot claim publications for authors from your institution, publishing house or journal via an administrative account. We regularly check the "claimed" entries in BASE. If we find such entries, we will remove them.
In order to claim publications for different authors, you need to know the personal login of the respective author in ORCID. To do so, perform the "claiming" for each author separately, according to our Short Guide to Claiming of Publications with ORCID iD (text in German only, but the screenshots will show you how to do it). If you manage several ORCID iDs, make sure to link the respective author with the correct ORCID profile.
What is ORCID’s Collect & Connect Program, and what are the badges that BASE was awarded in this program?
Collect & Connect is ORCID's integration and engagement program, which sets out best practices to clarify implementation goals and expectations across sectors, standardize the user experience, improve understanding of trust in connections between ORCID and other identifiers, increase the efficiency and quality of integrations, and help achieve ORCID's vision through a community approach.
In the context of ORCID's Collect & Connect program, BASE was awarded the following badges:
- BASE authenticates ORCID iDs to ensure researchers are correctly identified in the system.
- BASE displays iDs to signal to researchers that it supports the use of ORCID.
- BASE connects information to ORCID records for trusted sharing with others.
Why do I always end up on the German-language pages?
The BASE web pages are presented in the language that is preselected in your browser settings. These settings can easily be changed (for example, if you use the Mozilla "Firefox" browser, choose "Preferences" and then "Settings"). Switch to "English" as the preferred language and the BASE pages will be presented in English immediately.
Is there a print version of the BASE web pages available?
The BASE web pages are designed to switch automatically to a printer-optimized version when the print command is issued.
Is there an optimized version of the BASE web pages for mobile devices (smartphones etc.) available?
The BASE pages are designed responsively and therefore always optimally displayed on a large monitor as well as on a tablet or smartphone.
Do the BASE pages meet web standards?
The BASE web pages are properly designed to comply with all kinds of browsers and operating systems without any restrictions. The pages are created according to web standards (XHTML, CSS). Older browsers which do not support the latest web standards get a text version without special layout. Great importance was attached to ensuring that the requirements for barrier-free websites according to the BITV (barrier-free information technology regulation) are fulfilled as completely as possible.