A scientific platform that finds, "reads" and gives personalized summaries of scientific papers
Image credit: Scheme presenting my idea. The red triangle marks the innovative part: creating personalized summaries for each user.
jnikola · Dec 10, 2021
I once read that people in academia spend 20% of their total time at work just searching for papers. You have probably done the same many times: found a paper, read the abstract, and it turned out to be completely useless. Then you started reading the paper and, after 15 minutes, realized it is not relevant to your topic. You repeated this over and over, and yet, since it is hard to keep up with all the new papers published daily, you still somehow skipped that important paper published two months ago.
To avoid all this, I would like to develop a platform that would search, digest, summarize, and personalize scientific papers so you can read and understand the things you are interested in faster and more efficiently. The user would get new paper summaries daily through the platform feed, an app feed, or physical custom-printed newspapers.
The concept
Categorization of the user
The user would create an account on the platform and fill in their fields of interest, keywords of interest, experience, skills, workshops, certificates, etc. All of this could also be imported from LinkedIn by connecting the accounts. This would help the algorithm categorize you into one of several groups based on your level of understanding of a certain field or topic.
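To make the profile concrete, here is a minimal sketch, in Python, of the kind of record the categorization step could work from. The field names and the crude scoring into three groups are my assumptions, not a finalized schema.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    fields_of_interest: list[str] = field(default_factory=list)  # e.g. ["oncology"]
    keywords: list[str] = field(default_factory=list)            # e.g. ["nucleoporin", "gastric cancer"]
    experience_years: int = 0
    skills: list[str] = field(default_factory=list)
    certificates: list[str] = field(default_factory=list)

def expertise_level(profile: UserProfile) -> str:
    """Very rough bucketing into the three user groups described above."""
    score = profile.experience_years + len(profile.skills) + len(profile.certificates)
    if score < 3:
        return "beginner"
    if score < 10:
        return "intermediate"
    return "expert"
```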
The search part
The "search" algorithm would then query databases such as Google Scholar, PubMed, ScienceDirect, and similar for papers that could be of interest to you, based on your keywords and fields of interest. Anything considered interesting is then passed to the "digesting" algorithm.
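As a rough illustration of the search step, the sketch below queries PubMed through Biopython's Entrez wrapper (NCBI E-utilities). Building the query from profile keywords this way is an assumption, and Google Scholar or ScienceDirect would need their own connectors.

```python
# pip install biopython
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks for a contact address

def search_pubmed(keywords, max_results=20):
    query = " AND ".join(keywords)
    handle = Entrez.esearch(db="pubmed", term=query, retmax=max_results)
    record = Entrez.read(handle)
    handle.close()
    return record["IdList"]  # PubMed IDs to hand to the "digesting" step

print(search_pubmed(["nucleoporin", "gastric cancer"]))
```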
The read-and-understand part
The "digesting" algorithm would do what a compiler does to source code: recognize grammar symbols, sentences, paragraphs, words, verbs, pronouns, etc., and "understand" what is written.
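A small example of what "digesting" could look like in practice, using the off-the-shelf spaCy library to split text into sentences and tag each word, roughly what a compiler front end does with source code. The real pipeline would of course go much further than this.

```python
# pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("We measured nucleoporin expression in 40 gastric cancer samples.")

for sent in doc.sents:                             # sentence segmentation
    for token in sent:
        print(token.text, token.pos_, token.dep_)  # word, part of speech, syntactic role
```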
The personalized summary part
Once the paper is digested, the "summarization" task would be handed to one of the supervised writing algorithms. The algorithm would then "write" a summary containing all the information you could be interested in, in the shortest and most understandable form for you. If you are a student with no prior knowledge, the summary would consist of basic information giving you a short overview of what the scientists did in the paper. On the other hand, if you are an expert with a strong interest in methods, you would get a short but detailed summary focused on the methods.
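One very simple way to vary the summary with the reader is to vary its length. The sketch below uses an off-the-shelf Hugging Face summarization pipeline; the mapping from user level to summary length is my assumption and only scratches the surface of the personalization described here.

```python
# pip install transformers torch
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Assumed mapping from user level to (min, max) summary length in tokens.
LENGTHS = {"beginner": (30, 60), "intermediate": (60, 120), "expert": (120, 250)}

def summarize_for(level: str, paper_text: str) -> str:
    min_len, max_len = LENGTHS[level]
    result = summarizer(paper_text, min_length=min_len, max_length=max_len, truncation=True)
    return result[0]["summary_text"]
```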
What's new here?
If you search for and read scientific papers, you are familiar with PubMed and ScienceDirect, so the searching itself is not new. Some of them, plus Mendeley, Stork, and others, let you subscribe to new papers from your field of interest. That is not new either. Neither is digesting papers and extracting the important information (Scholarcy already does this).
The only innovative part here is the "personal" summarization. It would be done by multiple GPT-3 models, each supervised to prefer a certain style of writing. That way, the user would not only subscribe to new, specific content, but would also get it presented in the way he/she understands best. A simple version of this concept can be seen (and copied) in Rewordify or Simplish.
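A sketch of how the style conditioning could work with a GPT-3-style model: the same digested paper text, but the prompt changes with the reader. The style descriptions and the choice to steer by prompt rather than by fine-tuning are assumptions; the resulting prompt can be sent to any instruction-following language model.

```python
# Assumed style descriptions per user group; not a finalized taxonomy.
STYLE = {
    "beginner": "Explain in plain language, avoid jargon, focus on what was done and why it matters.",
    "expert_methods": "Be terse and technical, focus on the methods, models, and statistics used.",
}

def build_summary_prompt(level: str, paper_text: str) -> str:
    return (
        f"Summarize the following paper for this reader. {STYLE[level]}\n\n"
        f"Paper:\n{paper_text}\n\nSummary:"
    )

print(build_summary_prompt("expert_methods", "(digested paper text goes here)"))
```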
Possible problems
What if some papers are "locked" and you need to pay for access?
The platform would pay for access to the most common journals and databases so it can search for papers by scanning the full text. It would not give users free access to the papers themselves. Since the platform provides only a personalized summary, the user would still need to buy the paper if it is not published in an open-access journal. An add-on could be a premium account that includes free access to all the journals.
What do you think about the idea?
What could be the potential problems?
Do you have any ideas on what to add?
Contributors' updates:
summaries focused on the subject of interest ("For example, for articles 100% relevant to me I would like an overall balanced summary. For articles that I am only interested in a method but I don't care much for the rest I would like more information on that specific method.") (Michaela D)
impact factor information clearly visible (Michaela D)
let the user set a word limit to decide how long the abstract should be, which also helps guide the "digest" algorithm (Shubhankar Kulkarni)
determine the complexity of the search terms to decide what kind of information to provide to the user (Shubhankar Kulkarni)
There are Natural Language Processing (NLP) algorithms for document summarization. There is even a scientific (S&T) TLDR paper describing an approach: https://arxiv.org/abs/2004.15011
This could be implemented relatively straightforwardly. I've actually used the SciTLDR algorithm on research papers. That, plus keyphrase extraction, really lets someone quickly make sense of papers or filter down to the ones they want to read.
The challenge is getting access to the S&T papers' full text.
Now, there was a research study on the quality of outcomes when using abstracts (which are much easier to get access to) vs. full text for NLP. I don't know whether it covered document summarization, though.
My gut says you need the full text.
So a web service would either need the paper submitted to it, or the paper would need to be open access. The OA papers could be looked up, their page or PDF downloaded, and the SciTLDR processing performed.
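A minimal sketch of that open-access route, assuming the Unpaywall API for the DOI lookup: fetch the record, download the PDF if one is listed, then hand the text to the SciTLDR-style summarizer (left as a placeholder here).

```python
import requests

def fetch_oa_pdf(doi: str, email: str) -> bytes | None:
    # Unpaywall lookup by DOI; the email parameter is required by the service.
    resp = requests.get(f"https://api.unpaywall.org/v2/{doi}", params={"email": email})
    resp.raise_for_status()
    loc = resp.json().get("best_oa_location") or {}
    pdf_url = loc.get("url_for_pdf")
    if not pdf_url:
        return None  # paywalled, or no machine-readable copy available
    return requests.get(pdf_url).content

# Example with a known open-access article (the NumPy paper in Nature).
pdf = fetch_oa_pdf("10.1038/s41586-020-2649-2", email="you@example.org")
# Next steps (not shown): extract text from the PDF and run the summarizer on it.
```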
jnikola · 3 years ago
Hi Joe J! It's great to have someone here who understands these algorithms. Do you think they could be modified to summarize papers in a certain "style"? In other words, can you set the writing style for these summarization algorithms, or is the style fixed by the algorithm itself?
General comments
Shubhankar Kulkarni · 3 years ago
Definitely a problem, and a good solution as well. Apart from the problems already mentioned, I am worried about the types of users who can use this platform. It may be easier for academics (even those from different disciplines) but not for non-academic users. In such cases, the search function may not be able to find the most relevant papers. Even for me, certain common search terms (which will mostly be the ones non-academics use; for example, they may want to know about COVID-19, diabetes, or well-known hormones like dopamine and serotonin, but not about nucleoporins) give numerous results on PubMed. Sometimes I need to go through multiple abstracts to select the right paper. Non-relevant information might be off-putting for such users. How do we deal with this?
jnikola · 3 years ago
Shubhankar Kulkarni I am glad you asked. I imagined the platform being adapted (at the beginning) for three types of users: people with a low level of knowledge (freshmen, enthusiastic kids, everyday users), middle-level users, and skilled professionals. Articles would be searched, digested into words, phrases, sentences, and paragraphs, and the context would be drawn out. The supervised algorithms would then independently "write" personalized abstracts whose complexity and focus would be based on the background, education, and skills of the user.
To answer your question, there are two important things to consider.
The first one is the importance of accurate profile data input, which would be the basis for both the algorithm's search&find function and the "abstract writing" setup. If you enter no data, or you state that you don't have a high level of knowledge in the area of interest, the algorithm would find articles whose experiments, materials, methods, and conclusions are less complex. That way, you would get only papers dealing with the desired subject, with no nucleoporins or similar non-relevant information. If you are a skilled Ph.D. student, you would get more complex papers focusing on methods, materials, or certain aspects of the subject, depending on your previous papers and keyword input.
The second thing to consider, and in my opinion the most important one, would be user feedback on every article that was personalized for you. Whether you use the web platform or the smartphone app, after you read your personalized abstract (I imagine it like scrolling through the news feed on Facebook or Instagram), you would get a set of questions about the article. Questions such as "Was this article too long?", "Did you find it understandable?", "Was it helpful to you?", "Was it easy to read?", etc., would be the mechanism by which the "search&find" and "article writer" functions are trained. If you rate an article as "too complex", the next time you would get articles with less noise and more signal. If the article was perfectly clear to you, the algorithm would give you more complex articles next time. See what I did here? :) --> That's also how the platform would become your favorite news and learning playground.
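A tiny sketch of that feedback loop: each answer nudges a per-user complexity score that the "search&find" and "article writer" functions could read. The questions and the step size are illustrative only.

```python
def update_complexity(score: float, answers: dict) -> float:
    """Nudge a user's complexity score (0.0-1.0) based on post-reading feedback."""
    if answers.get("too_complex"):
        score -= 0.1          # next time: simpler papers, shorter summaries
    elif answers.get("perfectly_clear"):
        score += 0.1          # next time: more complex papers
    return min(max(score, 0.0), 1.0)
```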
Shubhankar Kulkarni · 3 years ago
Juranium The feedback thing is smart! I like it. I also thought of a way to bypass the explicit inclusion of criteria (low-level, mid-level, and high-level users). Let me know what you think. The searches of low-level users will definitely differ from those of high-level users. Continuing the previous example, low-level users will use terms like "diabetes", "cancer", "nutrition", or slightly more specific ones like "nutrition in pregnancy", etc. High-level users will use terms like "role of nucleoporin 1 in gastric cancer". So the search terms themselves can be used to decide what kind of information to provide to the user. Additionally, the user could be asked to provide a word limit for the software to adhere to while writing the review. The word limit could also act as a proxy for how deep the software should go with the search.
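A rough sketch of that idea: guess the user's level from how broad or specific the search terms are. The term list and thresholds below are illustrative only.

```python
# Assumed list of broad, consumer-level search terms.
BROAD_TERMS = {"diabetes", "cancer", "nutrition", "covid-19"}

def guess_level(query: str) -> str:
    words = query.lower().split()
    if len(words) <= 2 and set(words) & BROAD_TERMS:
        return "low"        # e.g. "diabetes"
    if len(words) >= 5:
        return "high"       # e.g. "role of nucleoporin 1 in gastric cancer"
    return "mid"

print(guess_level("role of nucleoporin 1 in gastric cancer"))  # -> "high"
```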
jnikola · 3 years ago
Shubhankar Kulkarni Great insight! Combined with the user feedback and the profile description, we now reach a solid level of personalization. The higher that level, the better the experience should be. As you mentioned, the search terms could help narrow the search, and the word limit could help shape the personalized abstract. Very cool and useful details, I'll add them to the session description!
Michaela D · 3 years ago
AI will definitely become huge when it comes to finding and summarizing information. In terms of personalization, I would be more interested in the piece of information that is emphasized rather than the writing style. For example, for articles 100% relevant to me I would like an overall balanced summary. For articles that I am only interested in a method but I don't care much for the rest I would like more information on that specific method.
A detail: I would also like the impact factor of the journal to be mentioned so that it saves me the time of looking for it.