A scientific platform that finds, "reads" and gives personalized summaries of scientific papers
Image credit: Scheme presenting my idea. Red triangle presents the innovative part of creating personalized summaries for each user.
J. Nikola, Dec 10, 2021
I once read that people in academia spend 20% of their total time at work just searching for papers. You have probably done the same many times: you found a paper, read the abstract, and it was completely useless. Then you started reading the paper and after 15 minutes realized it was not relevant to your topic. You repeated this multiple times but, since it is hard to keep up with all the new papers published daily, you still missed that important paper that came out 2 months ago.
To avoid all this, I would like to develop a platform that searches, digests, summarizes and personalizes scientific papers, so you can read and understand the things you are interested in faster and more efficiently. The user could get new paper summaries daily through the platform feed, an app feed, or a physical custom-printed newspaper.
Categorization of the user
The user would create an account on the platform and fill in their field of interest, keywords of interest, experience, skills, workshops, certificates, etc. All of this could also be imported from LinkedIn by connecting the accounts. This would help the algorithm place you into one of several groups, based on your level of understanding of a certain field, topic, etc.
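A minimal sketch of how such a categorization could look. The `UserProfile` fields, the scoring rule and the thresholds are all assumptions made up for illustration; the idea itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical profile fields, loosely following the list above.
    field_of_interest: str
    keywords: list[str] = field(default_factory=list)
    years_experience: int = 0
    certificates: list[str] = field(default_factory=list)


def categorize(profile: UserProfile) -> str:
    """Map a profile to a coarse reader group (thresholds are illustrative)."""
    score = profile.years_experience + len(profile.certificates)
    if score >= 8:
        return "expert"
    if score >= 3:
        return "intermediate"
    return "beginner"
```

A real platform would probably learn these groups from user behavior rather than hard-code them, but a rule like this is enough to get a first version running.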
The search part
The "search" algorithm would then search databases such as Google Scholar, PubMed, ScienceDirect and similar for papers that could be of interest to you, based on your keywords and fields of interest. Anything considered interesting is then passed to the "digesting" algorithm.
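The "considered interesting" step could be sketched as a keyword-overlap score over already retrieved records. This assumes each paper arrives as a dict with `title` and `abstract` keys and that a threshold of 0.5 is a sensible cut-off; both are illustrative choices, and no real database API is called here.

```python
import re


def _tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(text: str, keywords: list[str]) -> float:
    """Fraction of the user's keywords whose words all appear in the text."""
    if not keywords:
        return 0.0
    words = set(_tokens(text))
    hits = sum(1 for kw in keywords if all(w in words for w in _tokens(kw)))
    return hits / len(keywords)


def select_papers(papers: list[dict], keywords: list[str], threshold: float = 0.5) -> list[dict]:
    """Keep papers scoring at or above the threshold, best match first."""
    scored = [(relevance(p["title"] + " " + p["abstract"], keywords), p) for p in papers]
    kept = [(score, p) for score, p in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in kept]
```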
The read-and-understand part
The "digesting" algorithm would do to the paper what a compiler does to source code on a computer: recognize grammar symbols, sentences and paragraphs, words, verbs, pronouns, etc., and "understand" what is written.
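The first stage of that compiler analogy, splitting text into sentences and word tokens, might look like the sketch below. It uses plain regular expressions as a crude stand-in; recognizing verbs, pronouns and meaning would need a proper NLP library, which is not shown here.

```python
import re


def digest(text: str) -> list[list[str]]:
    """Split text into sentences, then each sentence into lowercase word tokens."""
    # A sentence boundary is assumed to be ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.findall(r"[A-Za-z0-9']+", s.lower()) for s in sentences if s]
```

This rule will mis-split abbreviations such as "et al." or "Fig. 2", which is one reason real papers need more than a regex.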
The personalized summary part
Once the paper is digested, the "summarization" task would be given to one of the supervised writing algorithms. The algorithm would then "write" a summary containing all the info you could be interested in, in the shortest and most understandable way for you. If you are a student with no prior knowledge, the summary would consist of basic info that gives you a short overview of what the scientists did in the paper. On the other hand, if you are an expert with a high interest in methods, you would get a short but detailed summary with a focus on methods.
What is new here?
If you search for and read scientific papers, you are familiar with PubMed and ScienceDirect, so searching is nothing new. Some of them, plus Mendeley, Stork and others, let you subscribe to new papers from your field of interest. That is not new either. Neither is digesting papers and extracting the important information (done by Scholarcy).
The only innovative approach here is the "personal" summarization. It would be done by many GPT-3 algorithms, each supervised to prefer a certain way of writing. That way the user would not only subscribe to new, specific content, but would also get it presented in the way he/she understands best. A simple version of this concept can be seen (and copied) in Rewordify or Simplish.
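To make the student vs. expert difference concrete, here is a rough sketch that swaps the supervised writing algorithm for a much simpler extractive rule: each reader group gets a different mix of sections. The group names, the `SUMMARY_PLANS` table and the section keys are hypothetical.

```python
import re

# Hypothetical plan: which sections each reader group sees, and how many
# leading sentences of each section to keep.
SUMMARY_PLANS = {
    "beginner": {"abstract": 2, "conclusion": 1},
    "expert_methods": {"methods": 3, "results": 1},
}


def first_sentences(text: str, n: int) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n])


def personalized_summary(sections: dict[str, str], group: str) -> str:
    """Build a summary from the sections that the reader group cares about."""
    plan = SUMMARY_PLANS[group]
    parts = [first_sentences(sections[name], n) for name, n in plan.items() if name in sections]
    return " ".join(part for part in parts if part)
```

A generative model would rewrite the text at the reader's level instead of copying sentences, but the selection logic (what to include for whom) would stay a similar decision.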
What if some papers are "locked" and you need to pay for the access?
The platform would pay for access to the most common journals and databases, so it can search for papers by scanning the full text. It would not give users free access. The platform would give the user a personalized summary, but he/she would still need to buy the paper if it is not published in an open access journal. An add-on could be a premium account that includes free access to all the journals.
What do you think about the idea?
What could be the potential problems?
Do you have any idea what to add?
summaries focused on the subject of interest ("For example, for articles 100% relevant to me I would like an overall balanced summary. For articles that I am only interested in a method but I don't care much for the rest I would like more information on that specific method.") (suggested by Michaela D)
impact factor information clearly visible (suggested by Michaela D)
set up the word limit to decide what length of the abstract you want + help guide the "digest" function algorithm (suggested by Shubhankar Kulkarni)
determine the complexity of the search terms to decide what kind of information to provide to the user (suggested by Shubhankar Kulkarni)
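The word-limit suggestion above could be sketched as trimming a finished summary at sentence boundaries. The function name and the choice to drop whole sentences rather than cut mid-sentence are assumptions.

```python
import re


def trim_to_word_limit(summary: str, max_words: int) -> str:
    """Keep leading sentences while the total word count stays within the limit."""
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    kept, count = [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if count + n > max_words:
            break
        kept.append(sentence)
        count += n
    return " ".join(kept)
```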
There are Natural Language Processing (NLP) algorithms for document summarization. There is even a TLDR paper for scientific (S&T) documents describing an approach: https://arxiv.org/abs/2004.15011
This could be implemented in a relatively straightforward way. I've actually used the scitldr algorithm on research papers. That, plus keyphrase extraction, really allows someone to quickly make sense of papers or filter down to the ones they want to read.
The challenge is getting access to the full text of the S&T papers.
Now, there was a research study on the quality of outcomes when using abstracts (which are much easier to get access to) vs. full text for NLP of those texts. But I don't know whether it covered document summarization.
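As a rough illustration of the keyphrase-extraction half (this is not the scitldr model, just a plain word-frequency count with an assumed stopword list):

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "we", "for",
             "on", "with", "that", "this", "are", "by"}


def keyphrases(text: str, top_n: int = 3) -> list[str]:
    """Return the most frequent non-stopword terms as crude key terms."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS and len(w) > 2]
    return [word for word, _ in Counter(words).most_common(top_n)]
```

Even something this simple helps with filtering: if none of the top terms overlap with your keywords, the paper is probably not worth opening.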
My gut says you need the full-text.
So, a web service would either need to have the paper submitted to it, or the paper would need to be open access. The OA papers could be looked up, their page or PDF downloaded, and the scitldr processing performed.
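That decision could be sketched as a small routing function. The record fields (`uploaded_path`, `is_open_access`, `pdf_url`) are hypothetical names, not taken from any real lookup service, and the download and summarization steps are left out.

```python
from typing import Optional


def choose_text_source(paper: dict) -> Optional[tuple[str, str]]:
    """Decide where the full text comes from, or None if it is unavailable.

    All field names on `paper` are hypothetical.
    """
    if paper.get("uploaded_path"):
        # The user submitted the paper to the service directly.
        return ("upload", paper["uploaded_path"])
    if paper.get("is_open_access") and paper.get("pdf_url"):
        # An open-access copy was found by some lookup step (not shown).
        return ("open_access", paper["pdf_url"])
    return None
```

Papers that return `None` here are exactly the "locked" ones, so the service would either fall back to the abstract or ask the user to upload the file.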