Getting Started with OSF Search


This Article Is Licensed Under CC0 For Maximum Reuse.





Overview

OSF Search is your destination for discovering all content types across the platform, including results for registrations, preprints, projects (and related components), files, and users, in one easy-to-use interface. The OSF Search page can be used for searching, browsing, or both.


OSF Search does not include full-text search of files uploaded to the platform. Any term entered on OSF Search is matched against the metadata of all OSF objects, including title, description/abstract, filename, contributors, license, identifiers, and registration responses.


Each search result includes key metadata to help you determine if these are the results you are looking for, as well as a "Context" section in each result that provides a preview of the metadata fields that reference your search term. These include the object title, contributors, description/abstract, and license.


Refine your search results or browse the entire OSF using specialized filters, including supporting funder, institutions, resource type, related materials, and more.


Finally, remember to make your own OSF research as discoverable as possible by adding robust metadata and making it public.


Navigating to the OSF Search page

Click "OSF Search" from the navbar on the left-hand side of any OSF page, or navigate to https://osf.io/search/.


Searching by Term 

In the search bar, enter a search term and then click the magnifying glass icon or press the "Enter" key. Your search term can be as short, long, specific, or general as you like.


Search Results

The total number of results for your query is presented in the top left.  If there are more than 10,000 results, it will only read "10,000+ results".  


Modify your query, use filters, or browse by object type to narrow your results.


The search result cards provide a preview of the content within, with the key metadata based on the object type.


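Each card represents one of the following in OSF: a project (or project component), a registration (or registration component), a preprint, a file, or a user.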


All result cards will have a "Context" section. This section will preview exactly where your search term appears as part of this object, including a snippet of this relevant section. Your search term will be italicized. 





View resource metadata 

After you have refined your search, you will see a list of results. Each result displays the metadata most relevant to its object type, indicating what is included in that resource. To see the full metadata, click the {V} arrow button on the result to expand it.


Registration search results display supplemental information, including connected open resources, the registration provider, and the registration template.

Refine Results with Filters or Search Operators

If your search results are not precise enough, you can use several different methods to refine the results.

Browse by OSF Object Types

Use the object type filters above the search results to limit your results to a specific OSF object type, or display all object types.

The available object types are: Projects (or Project Components), Registrations (or Registration Components), Preprints, Files, and Users.

Using Filters

Filters allow users to refine their search results based on terms present in the metadata of the results. These filters are dynamic: each selection changes the number of results and can add or remove options in the other filters.


Each option within the filters displays the number of results that will be available if you make that selection. Add additional facets to find the specific works that you are interested in exploring. 
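The way filter counts shift as you make selections can be illustrated with a small sketch. This is only a toy model, not how OSF computes its filters: the records, field names, and the helper functions `facet_counts` and `apply_filter` are all hypothetical.

```python
from collections import Counter

# Hypothetical metadata records standing in for search results.
records = [
    {"type": "Preprint", "license": "CC-BY"},
    {"type": "Preprint", "license": "CC0"},
    {"type": "Registration", "license": "CC0"},
    {"type": "Project", "license": "MIT"},
]

def facet_counts(results, field):
    """Count how many results carry each value of `field`."""
    return Counter(r[field] for r in results if field in r)

def apply_filter(results, field, value):
    """Keep only the results whose `field` equals `value`."""
    return [r for r in results if r.get(field) == value]

# Before filtering, every license value is offered with its result count.
print(facet_counts(records, "license"))

# After selecting one object type, the license options and counts change.
preprints = apply_filter(records, "type", "Preprint")
print(facet_counts(preprints, "license"))
```

In this toy data, selecting the "Preprint" type removes "MIT" from the license options because no remaining result carries it, which mirrors the dynamic behavior described above.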


Types of Filters:

Filter: Definition
Date Created: Allows you to create ranges of time when the results were created and posted
Funder: Filter by specific funders listed on the metadata of the result
Subject: Filter by subject type, based on the disciplines that you can choose for registrations and preprints
License: Filter based on the license type of the results
Resource Type (General): An alternative way to filter based on resources in OSF metadata. See more here: Resource type in OSF
Institution: Filter based on current OSF institution members. Only current paid members will populate
Provider: Provider corresponds to service providers located within the OSF
Is part of a collection: Filter for materials that are part of Collections on the OSF
Registration Template: If you are searching for registrations you can filter by the template used during the creation
Data: Filter by different data types
Includes Community Schema: Find OSF content with attached community metadata standards via CEDAR. See more here: CEDAR and data management


How to search within a filter

Some filters contain a large number of values. In this case, not all values are displayed initially, and you may need to search within the filter dropdown to find the one you want. By default, the values within the filter that have the most relevant results are listed first.



To find additional values, search for the specific value that you are looking for. Only relevant results will populate. 

Using Search Operators

OSF uses a set of special characters to make more precise queries. Each is explained in more detail below:

  • + signifies AND operation
  • | signifies OR operation
  • - negates a single term
  • " wraps a number of terms to signify a phrase for searching
  • * at the end of a term signifies a prefix query
  • ( and ) indicates grouping of terms
  • ~ after a term enables broader or “fuzzy” matching

To use one of these characters literally (e.g. searching for “B+ blood type”), escape it with a preceding backslash (\): B\+ blood type.
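If you build queries in a script, the escaping rule above can be applied automatically. The following is a minimal sketch; the function name `escape_term` is hypothetical, and it escapes every occurrence of an operator character, which may be more than is strictly necessary in some positions.

```python
# Characters this guide lists as search operators, plus the backslash itself.
SPECIAL_CHARS = set('+|-"*()~\\')

def escape_term(text: str) -> str:
    """Put a backslash before each operator character so it is read literally."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)

print(escape_term("B+ blood type"))  # B\+ blood type
```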

+ signifies AND operation

This will ensure both terms MUST be present in results, but may be anywhere in the document in any order (boolean “AND” operation). This is the default way any terms will be combined in OSF, so even if you do not include the "+" sign between terms, your query will only return results with all of the terms.

Example 1: research + reproducible

https://osf.io/search?q=research%20%2B%20reproducible&search=research%20%2B%20reproducible

Example 2: research reproducible

https://osf.io/search?q=research%20%2B%20reproducible&search=research%20reproducible
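The example links in this guide contain percent-encoded queries (for instance, a space becomes %20 and + becomes %2B). The sketch below shows one way to produce such a link with Python's standard library. It assumes the `search` URL parameter seen in several links in this guide; the function name `search_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://osf.io/search"  # address of the search page used in this guide's links

def search_url(query: str) -> str:
    """Return a search-page link with `query` percent-encoded."""
    # quote_via=quote encodes spaces as %20 rather than "+",
    # so a literal "+" operator is unambiguous (it becomes %2B).
    return BASE_URL + "?" + urlencode({"search": query}, quote_via=quote)

print(search_url("research + reproducible"))
# https://osf.io/search?search=research%20%2B%20reproducible
```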

| signifies OR operation

To broaden a search so that results contain either of two terms or phrases (a boolean “OR” operation), place | (pipe, or vertical bar) between them. Every result will contain at least one of the terms in the query.

Example 1: maternal | paternal

https://osf.io/search?q=research%20%2B%20reproducible&search=maternal%20%7C%20paternal

Example 2: positive | affirmative

https://osf.io/search?q=research%20%2B%20reproducible&search=positive%20%7C%20affirmative


" " wraps a number of terms to signify a phrase for searching

Putting your search terms in quotes ensures adjacent words are searched as a phrase and not as individual words. Searching Maternal Depression without quotes will return results where those individual words appear anywhere in the metadata.

Example 1: “Maternal Depression” vs Maternal Depression

“Maternal Depression”

https://osf.io/search?q=%22maternal%20depression%22 ; 160 results all containing the phrase "Maternal Depression."

Maternal Depression

https://osf.io/search?q=maternal%20depression ; 486 results containing the terms "Maternal" and "Depression" anywhere in the metadata.
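The difference between the two searches can be sketched with a toy matcher. This is only an illustration of the idea, not OSF's implementation; the helper names and the sample descriptions are hypothetical.

```python
import re

def words(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has_all_terms(text: str, query: str) -> bool:
    """True if every query word appears somewhere in the text, in any order."""
    tokens = set(words(text))
    return all(term in tokens for term in words(query))

def has_phrase(text: str, phrase: str) -> bool:
    """True if the phrase's words appear next to each other, in order."""
    tokens, target = words(text), words(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

scattered = "Depression screening in a maternal health clinic"
adjacent = "Maternal depression and infant sleep"

print(has_all_terms(scattered, "maternal depression"))  # True
print(has_phrase(scattered, "maternal depression"))     # False
print(has_phrase(adjacent, "maternal depression"))      # True
```

The first description would match the unquoted search only, which is why the unquoted search returns more results than the quoted one.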

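- negates a single term

Placing - (minus sign) directly before a term excludes results that contain that term.

Example: depression -maternal

returns results containing "depression" but not "maternal"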

* at the end of a term searches for words that start with that term

The * (asterisk) operator is useful for finding multiple variations of a word starting with the same letters. Searching "CAT*" would return cat, cats, catholic, cathartic, etc.

Example: pos*

https://osf.io/search?search=pos*


~ after a word enables broader or "fuzzy" matching

This operator is useful for finding matches with words that are similar but differ in the order or number of characters (e.g. finding both color and colour, or finding misspellings like accomodate and acommodate). The operator can be combined with a number to indicate how many differences are acceptable.

Example 1: girafe~1

returns results containing words that match the term or are different by any single character, or single difference in the order of characters.

https://osf.io/search?search=girafe~1
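To get a feel for what "different by a single character or a single swap" means, the sketch below counts differences between two words using the optimal string alignment distance (insertions, deletions, substitutions, and swaps of adjacent characters each count as one). It is an assumption that this matches exactly how the search backend counts differences; the function name `edit_distance` is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Count single-character edits (including adjacent swaps) turning a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a's first i characters
    for j in range(cols):
        d[0][j] = j  # insert all of b's first j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as one difference.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

print(edit_distance("girafe", "giraffe"))  # 1 (one missing letter)
print(edit_distance("girafe", "giarfe"))   # 1 (two letters swapped)
```

Under this way of counting, girafe~1 would match "giraffe" and "giarfe", but not a word that differs in two or more places.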

( ) indicates grouping of terms

Combining terms within parentheses ( ) creates groups of terms that can then be combined. This is helpful when combining multiple complex concepts. Groups can be combined or nested in multiple ways.

Example 1: ("climate change") | ("global warming")

returns results containing "climate change" or "global warming"

https://osf.io/search?search=(%22climate%20change%22)%20%7C%20(%22global%20warming%22)

Example 2: (("climate change") | ("global warming")) "systematic review"

returns results with "climate change" or "global warming" AND the phrase "systematic review"

https://osf.io/search?search=%20((%22climate%20change%22)%20%7C%20(%22global%20warming%22))%20%22systematic%20review%22
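Nested queries like Example 2 are easy to mistype by hand. The sketch below assembles the same query string from small helper functions. The helper names (`phrase`, `group`, `any_of`, `all_of`) are hypothetical and are not part of any OSF tool; they only produce text that follows the operator rules described in this guide.

```python
def phrase(text: str) -> str:
    """Wrap words in double quotes so they are searched as a phrase."""
    return f'"{text}"'

def group(expression: str) -> str:
    """Wrap an expression in parentheses."""
    return f"({expression})"

def any_of(*expressions: str) -> str:
    """Combine expressions with | (OR) and group the result."""
    return group(" | ".join(expressions))

def all_of(*expressions: str) -> str:
    """Combine expressions with spaces, which the search treats as AND."""
    return " ".join(expressions)

query = all_of(
    any_of(group(phrase("climate change")), group(phrase("global warming"))),
    phrase("systematic review"),
)
print(query)
# (("climate change") | ("global warming")) "systematic review"
```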

How does OSF Search Determine Relevance?


For blank searches of all of the OSF:

When a user first opens the OSF Search page and no search terms have been entered, the OSF retrieves a random set of results, so that you see a sample of what is in the OSF.

For specific searches using terminology:

For the results of a search query, relevance is determined by the default algorithm of Elasticsearch, the tool the OSF uses for indexing (https://www.elastic.co/guide/en/elasticsearch/reference/8.9/similarity.html). That algorithm is called Okapi BM25 (https://en.wikipedia.org/wiki/Okapi_BM25). Put simply, the algorithm looks at all the words in the text (in this case, the metadata record) and compares how often the query words appear relative to the length of that text. That general idea is used in many ranking algorithms and is called TF-IDF (term frequency-inverse document frequency): https://en.wikipedia.org/wiki/Tf%E2%80%93idf

Example: a record where the query term appears 10 times in 100 words has a higher term frequency than a record where the term appears 25 times in 1,000,000 words. Although the term appears more times in the second record, it occurs less often relative to the size of the text.
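The example above can be worked through numerically. The sketch below computes the simple frequency ratio and the term-frequency part of the textbook BM25 formula for both records. It is a simplified illustration, not OSF's or Elasticsearch's exact scoring: it leaves out the inverse document frequency factor (which is the same for both records when the term is the same), and the average record length of 1,000 words is a made-up value. The constants 1.2 and 0.75 are Elasticsearch's documented BM25 defaults.

```python
K1 = 1.2   # how quickly repeated occurrences stop adding to the score
B = 0.75   # how strongly long records are penalized

def frequency_ratio(term_count: int, doc_length: int) -> float:
    """Share of the record's words that are the query term."""
    return term_count / doc_length

def bm25_term_weight(term_count: int, doc_length: int, avg_doc_length: float) -> float:
    """Term-frequency part of textbook BM25 (without the IDF factor)."""
    length_norm = 1 - B + B * doc_length / avg_doc_length
    return term_count * (K1 + 1) / (term_count + K1 * length_norm)

AVG_LENGTH = 1_000  # hypothetical average record length, in words

short_record = bm25_term_weight(10, 100, AVG_LENGTH)
long_record = bm25_term_weight(25, 1_000_000, AVG_LENGTH)

print(frequency_ratio(10, 100), frequency_ratio(25, 1_000_000))
print(short_record > long_record)  # True: the short record ranks higher
```

Under these assumptions the short record receives the larger weight, even though the term appears more times in the long one.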


How does the OSF determine
