Getting Started with OSF Search
This Article Is Licensed Under CC0 For Maximum Reuse.
The following is a Table of Contents that links directly to specific sections within the guide.
- Overview
- Navigating to the OSF Search page
- Searching By Term
- Interpreting Your Search Results
- Refine Results with Filters or Search Operators
- How Does OSF Search Determine Relevance
Overview
OSF Search is your destination for discovering all content types across the platform, including registrations, preprints, projects (and related components), files, and users, in one easy-to-use interface. The OSF Search page can be used for searching, browsing, or both.
OSF Search does not include full-text search of files uploaded to the platform. Any term entered in OSF Search is matched against the metadata of all OSF objects, including title, description/abstract, filename, contributors, license, identifiers, and registration responses.
Each search result includes key metadata to help you determine if these are the results you are looking for, as well as a "Context" section in each result that provides a preview of the metadata fields that reference your search term. These include the object title, contributors, description/abstract, and license.
Refine your search results or browse the entire OSF using specialized filters, including supporting funder, institutions, resource type, related materials, and more.
Finally, remember to make your own OSF research as discoverable as possible by adding robust metadata and making it public.
Navigating to the OSF Search page
Click "OSF Search" from the navbar on the left-hand side of any OSF page, or navigate to https://osf.io/search/.
Searching by Term
In the search bar, enter a search term and then click the question mark or hit the "Enter" key. Your search term can be as short, long, specific, or general as you like.
Search Results
The total number of results for your query is presented in the top left. If there are more than 10,000 results, it will only read "10,000+ results".
Modify your query, use filters, or browse by object type to narrow your results.
The search result cards provide a preview of the content within, with the key metadata based on the object type.
Each card represents one of the following in OSF:
- Projects (or Project Component)
- Registrations (or Registration Component)
- Preprints
- Files
- Users
All result cards will have a "Context" section. This section will preview exactly where your search term appears as part of this object, including a snippet of this relevant section. Your search term will be italicized.
[placeholder for screenshot of context section on a results card]
View resource metadata
After you have refined your search, you will see a list of results. Each result displays the metadata most relevant to its object type, indicating what is included in that resource. To see the full metadata, click the {V} arrow button to expand the result.
Registration search results display supplemental information, including connected open resources, the registration provider, and the registration template.
Refine Results with Filters or Search Operators
If your search results are not precise enough, you can use several different methods to refine the results.
Browse by OSF Object Types
Use the object type filters above the search results to limit your results to a specific OSF object type, or display all object types.
The available object types are: Projects (or Project Component), Registrations (or Registration Component), Preprints, Files, and Users.
Using Filters
Filters allow users to refine their search results based on terms present in the metadata of the results. These filters are dynamic: each selection changes the number of results and may add or remove options in the other filters.
Each option within the filters displays the number of results that will be available if you make that selection. Add additional facets to find the specific works that you are interested in exploring.
Types of Filters:
Filter: | Definition: |
Date Created | Filter by the date range in which the results were created and posted |
Funder | Filter by specific funders listed on the metadata of the result |
Subject | Filter based on the disciplines that you can choose for registrations and preprints |
License | Filter based on the license type of the results |
Resource Type (General) | An alternative way to filter based on resources in OSF metadata. See more here: Resource type in OSF |
Institution | Filter based on current OSF institution members. Only current paid members will populate |
Provider | Provider corresponds to service providers located within the OSF |
Is part of a collection | Filter for materials that are part of Collections on the OSF |
Registration Template | If you are searching for registrations you can filter by the template used during the creation |
Data | Filter by different data types |
Subject | Filter by subject type |
Includes Community Schema | Find OSF content with attached community metadata standards via CEDAR. See more here: CEDAR and data management |
How to search within a filter
Some filters contain a large number of values. In this case, not all values are displayed initially, and you may need to search within the filter dropdown to find the one you want. By default, the values within the filter that have the most relevant results are listed first.
To find additional values, search for the specific value that you are looking for. Only relevant results will populate.
Using Search Operators
OSF uses a set of special characters to make more precise queries. Each is explained in more detail below:
- + signifies AND operation
- | signifies OR operation
- - negates a single term
- " wraps a number of terms to signify a phrase for searching
- * at the end of a term signifies a prefix query
- ( and ) indicates grouping of terms
- ~ after a term enables broader or “fuzzy” matching
To use one of these characters literally (e.g. searching for “B+ blood type”), escape it with a preceding backslash (\): B\+ blood type.
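If you build queries programmatically, the escaping rule above can be sketched in a few lines of Python. This is only an illustration: the function name escape_operators is hypothetical, and treating the backslash itself as a character to escape is an assumption not stated in this guide.

```python
import re

# Operator characters listed in this guide, plus the backslash itself (assumption).
OPERATOR_CHARS = '+|-"*()~\\'

def escape_operators(term: str) -> str:
    """Prefix every operator character in `term` with a backslash."""
    pattern = "[" + re.escape(OPERATOR_CHARS) + "]"
    # A function replacement avoids any special handling of backslashes.
    return re.sub(pattern, lambda m: "\\" + m.group(0), term)

print(escape_operators("B+ blood type"))  # B\+ blood type
```

Terms without operator characters pass through unchanged.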
+ signifies AND operation
This will ensure both terms MUST be present in results, but may be anywhere in the document in any order (boolean “AND” operation). This is the default way any terms will be combined in OSF, so even if you do not include the "+" sign between terms, your query will only return results with all of the terms.
Example 1: research + reproducible
https://osf.io/search?q=research%20%2B%20reproducible&search=research%20%2B%20reproducible
Example 2: research reproducible
https://osf.io/search?q=research%20%2B%20reproducible&search=reproducible
| signifies OR operation
To broaden a search to include either of two terms or phrases (a boolean “OR” operation), place | (pipe or vertical bar) between them. Every result will contain at least one of the terms in the query.
Example 1: maternal | paternal
https://osf.io/search?q=research%20%2B%20reproducible&search=maternal%20%7C%20paternal
Example 2: positive | affirmative
https://osf.io/search?q=research%20%2B%20reproducible&search=positive%20%7C%20affirmative
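The difference between AND and OR can be illustrated with a toy model in Python. The records and function names below are hypothetical, and the sketch only mimics the behavior described in this guide; it is not how OSF Search is implemented.

```python
# Toy metadata records (hypothetical), each reduced to a set of lowercase words.
records = {
    "A": {"maternal", "health", "survey"},
    "B": {"paternal", "leave", "policy"},
    "C": {"maternal", "paternal", "bonding"},
    "D": {"open", "science"},
}

def match_all(terms):
    """AND: keep records that contain every term."""
    return sorted(name for name, words in records.items() if set(terms) <= words)

def match_any(terms):
    """OR: keep records that contain at least one term."""
    return sorted(name for name, words in records.items() if set(terms) & words)

print(match_all(["maternal", "paternal"]))  # ['C']
print(match_any(["maternal", "paternal"]))  # ['A', 'B', 'C']
```

As the output suggests, OR always returns at least as many results as AND for the same terms.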
" " wraps a number of terms to signify a phrase for searching
Putting your search terms in quotes ensures adjacent words are searched as a phrase and not as individual words. Searching “Maternal Depression” without quotes will return results where the database found those individual words anywhere in the source.
Example 1: “Maternal Depression” vs Maternal Depression
“Maternal Depression”
https://osf.io/search?q=%22maternal%20depression%22 ; 160 results all containing the phrase "Maternal Depression."
Maternal Depression
https://osf.io/search?q=maternal%20depression ; 486 results containing the terms "Maternal" and "Depression" anywhere in the metadata.
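Why the quoted search returns fewer results can be seen in a small Python sketch. The sample text and function names are hypothetical, and splitting on whitespace is a simplification of how a real search index breaks text into words.

```python
def tokens(text):
    """Lowercase the text and split it into words (simplified)."""
    return text.lower().split()

def contains_all_terms(text, query):
    """Unquoted search: every word must appear somewhere, in any order."""
    words = tokens(text)
    return all(term in words for term in tokens(query))

def contains_phrase(text, phrase):
    """Quoted search: the words must appear next to each other, in order."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

record = "Depression screening in maternal care settings"  # hypothetical description
print(contains_all_terms(record, "maternal depression"))  # True
print(contains_phrase(record, "maternal depression"))     # False
```

The record matches the unquoted query because both words occur, but not the quoted one because they are not adjacent.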
- negates a single term
* at the end of a term searches for words that start with that term
The * (asterisk) operator is useful for finding multiple variations of a word that start with the same letters. "CAT*" would return cat, cats, catholic, cathartic, etc.
Example: pos*
https://osf.io/search?search=pos*
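A prefix query behaves roughly like a "starts with" test on each word. The Python sketch below shows the idea with a hypothetical word list and function name; it assumes matching is case-insensitive, as the "CAT*" example suggests.

```python
def prefix_matches(query, words):
    """Return the words that start with the text before the trailing '*'."""
    prefix = query.rstrip("*").lower()
    return [w for w in words if w.lower().startswith(prefix)]

words = ["cat", "cats", "catholic", "cathartic", "scatter", "dog"]
print(prefix_matches("CAT*", words))  # ['cat', 'cats', 'catholic', 'cathartic']
```

Note that "scatter" is not returned: the letters must appear at the start of the word.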
~ after a word enables broader or "fuzzy" matching
This operator is useful for finding words that are similar but differ in the order or number of characters (e.g. finding both color and colour, or finding misspellings like accomodate and acommodate). The operator can be combined with a number to indicate how many differences are acceptable.
Example 1: girafe~1
returns results containing words that match the term exactly or differ from it by a single character (one added, removed, or changed) or by a single swap of two adjacent characters.
https://osf.io/search?search=girafe~1
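The number after ~ can be read as a maximum "edit distance" between your term and a matching word. The Python sketch below computes one common version of that distance (optimal string alignment, which counts insertions, deletions, substitutions, and swaps of adjacent characters). It is an illustration of the concept under that assumption, not the code OSF Search runs.

```python
def edit_distance(a: str, b: str) -> int:
    """Count the single-character edits (or adjacent swaps) needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a
    for j in range(cols):
        d[0][j] = j  # insert all characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

print(edit_distance("girafe", "giraffe"))  # 1
print(edit_distance("girafe", "giraef"))   # 1
print(edit_distance("color", "colour"))    # 1
```

Under this reading, girafe~1 would match any word whose distance from "girafe" is 0 or 1.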
( ) indicates grouping of terms
Combining terms within parentheses ( ) creates groups of terms that can then be combined. This is helpful when combining multiple complex concepts. Groups can be combined or nested in multiple ways.
Example 1: ("climate change") | ("global warming")
returns results containing "climate change" or "global warming"
https://osf.io/search?search=(%22climate%20change%22)%20%7C%20(%22global%20warming%22)
Example 2: (("climate change") | ("global warming")) "systematic review"
returns results with "climate change" or "global warming" AND the phrase "systematic review"
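Links like the one in Example 1 can be produced by percent-encoding the query text. The Python sketch below uses the standard library for this; the function name search_url is hypothetical, and the "search" parameter is simply copied from the example URLs in this guide.

```python
from urllib.parse import quote

def search_url(query: str) -> str:
    """Build a search link by percent-encoding the query text."""
    # Parentheses, '*' and '~' are left readable, as in this guide's example URLs.
    return "https://osf.io/search?search=" + quote(query, safe="()*~")

print(search_url('("climate change") | ("global warming")'))
# https://osf.io/search?search=(%22climate%20change%22)%20%7C%20(%22global%20warming%22)
```

The same function can be applied to the Example 2 query to share it as a link.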
How Does OSF Search Determine Relevance
For blank searches of all of the OSF:
For specific searches using terminology:
For the results of a search query, the algorithm we use to determine relevance is the default provided by Elasticsearch, the tool the OSF uses for indexing (https://www.elastic.co/guide/en/elasticsearch/reference/8.9/similarity.html). Specifically, that algorithm is called Okapi BM25 (https://en.wikipedia.org/wiki/Okapi_BM25). Put simply, the algorithm looks at all the words in the text (in this case the metadata record) and weighs how often the query words appear against the length of that text. That general idea is used in many ranking algorithms and is called TF-IDF (term frequency-inverse document frequency): https://en.wikipedia.org/wiki/Tf%E2%80%93idf
Example: a record of 100 words in which the query term appears 10 times has a higher term frequency than a record of 1,000,000 words in which the term appears 25 times. The term appears more times in the second record, but far less often relative to the size of the text.
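The example can be worked through with the term-frequency part of the textbook BM25 formula. In the Python sketch below, k1=1.2 and b=0.75 are commonly cited default values, the average record length of 1,000 words is a hypothetical number chosen for illustration, and the inverse-document-frequency factor is left out because it is the same for both records. Actual OSF scores will differ.

```python
def bm25_tf(term_count, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency component of textbook BM25 (IDF factor omitted)."""
    # Longer-than-average records are penalized; shorter ones are boosted.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return term_count * (k1 + 1) / (term_count + length_norm)

AVG_LEN = 1_000  # hypothetical average record length, in words

short_record = bm25_tf(term_count=10, doc_len=100, avg_doc_len=AVG_LEN)
long_record = bm25_tf(term_count=25, doc_len=1_000_000, avg_doc_len=AVG_LEN)

print(round(short_record, 2), round(long_record, 2))  # 2.12 0.06
```

Under these assumptions the 100-word record scores far higher, even though the term appears fewer times in it.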