Find repeated sentences in Word

I want to detect sentences, long phrases, and possibly paragraphs that have been repeated in a document. I’ve been working on a document, and want to make sure I haven’t copied the same or similar text into more than one spot.

Ideally, the application should be available online, or easily installable on an OS X Mavericks computer with Pages but not Microsoft Word installed.

I came across Pro Writing Aid, but its «Repeat Words & Phrases» seems to me very noisy — some individual words are highlighted by it merely for occurring a fair bit.

I have also seen Online-Utility.org’s Text Analyzer, which isn’t bad. However, its output is redundant. If a seven-word phrase is repeated twice, it also lists the two six-word phrases made up of words one to six and words two to seven as occurring twice. It is also hard to visualise the results and see whether particular sections have a high amount of duplicated text.
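The redundancy described above can be avoided by reporting only the longest repeated phrases. Here is a minimal Python sketch of that idea. It is not how Text Analyzer works, and the names `repeated_phrases`, `min_words`, and `max_words` are made up for illustration. It counts word n-grams from longest to shortest and skips any shorter phrase that sits inside a longer phrase repeated at least as often.

```python
import re
from collections import Counter


def repeated_phrases(text, min_words=3, max_words=10):
    """Return {phrase: count} for repeated phrases, hiding covered sub-phrases."""
    # Lowercase and drop punctuation so "Dinner?" and "dinner" match.
    words = re.findall(r"[a-z0-9']+", text.lower())
    kept = {}  # tuple of words -> number of occurrences
    # Go from the longest phrases down to the shortest.
    for n in range(max_words, min_words - 1, -1):
        counts = Counter(
            tuple(words[i:i + n]) for i in range(len(words) - n + 1)
        )
        for gram, count in counts.items():
            if count < 2:
                continue
            # Skip a phrase that is part of a longer kept phrase
            # which repeats at least as many times.
            covered = any(
                c >= count
                and any(longer[j:j + n] == gram
                        for j in range(len(longer) - n + 1))
                for longer, c in kept.items()
            )
            if not covered:
                kept[gram] = count
    return {" ".join(gram): count for gram, count in kept.items()}


sample = ("one two three four five six seven and then "
          "one two three four five six seven")
result = repeated_phrases(sample)
print(result)  # {'one two three four five six seven': 2}
```

A sub-phrase that occurs more often than the longer phrase containing it is still reported, since that extra occurrence is new information.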

This question is different from Program to search for word repetitions in text documents and Program to search for word repetitions in Word documents in that they’re asking about one word being repeated by the following word.

I would like to know how I could find similarity within the same sentence.
I have a list of sentences like these:

my_list=["do you want pizza for dinner? Do you want pizza for dinner?", "I like pizza", "I have no money I have no money"]

I would like to create a pandas dataframe where I assign 1 if a sentence is repeated within the same string, and 0 otherwise.

Something like this:

Text                                                              Repeated?
do you want pizza for dinner? Do you want pizza for dinner?            1
I like pizza                                                           0
I have no money I have no money                                        1

I was thinking of something like this:

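from collections import Counter

# my_list is a list of strings, so split and count each string separately
# (calling .split() on the list itself raises AttributeError)
for sentence in my_list:
    counts = Counter(sentence.split())
    for word in sorted(counts):
        print('"' + word + '" is repeated ' + str(counts[word]) + ' time(s).')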

I would then compare the total number of words in each sentence with the number of unique words. But I don't think that is a good way to code it.
Do you know if there is another way to get the expected result?
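One possible approach, offered as a sketch rather than a definitive answer: flag a string when any run of a few consecutive words occurs more than once in it. The helper name `has_repeated_phrase` and the `min_words` threshold are assumptions chosen for illustration. Case and punctuation are ignored so that "dinner?" and "Dinner" compare equal.

```python
import re
from collections import Counter


def has_repeated_phrase(text, min_words=3):
    """Return 1 if any run of `min_words` words occurs more than once, else 0."""
    # Lowercase and keep only word characters so punctuation does not matter.
    words = re.findall(r"[a-z0-9']+", text.lower())
    ngrams = Counter(
        tuple(words[i:i + min_words])
        for i in range(len(words) - min_words + 1)
    )
    return int(any(count > 1 for count in ngrams.values()))


my_list = [
    "do you want pizza for dinner? Do you want pizza for dinner?",
    "I like pizza",
    "I have no money I have no money",
]
flags = [has_repeated_phrase(sentence) for sentence in my_list]
print(flags)  # [1, 0, 1]
```

With pandas installed, `pd.DataFrame({"Text": my_list, "Repeated?": flags})` should give the table shown in the question. Comparing total words with unique words would also flag ordinary sentences that reuse small words such as "the", which is why this sketch looks at runs of words instead.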

Do you know what phrases you overuse when you write? I am well aware that I use the phrase «over and over,» well, over and over. One time I was editing a scene of about 600 words. My main character nodded four times! I just imagined her as a bobble-head doll.

Every writer has phrases that they overuse. For fiction writers, we often use the same few actions. Raising an eyebrow is a favorite. J.K. Rowling’s characters raise an eyebrow over 128,000 times. George R.R. Martin’s characters do it over 88,000 times! But every person who writes will repeat phrases.

Sometimes, repetition can be a powerful literary device to show emphasis. But typically, too much repetitiveness will make your writing feel robotic and boring. Luckily, ProWritingAid is here to help. There are multiple reports you can run to find unintended repetitiveness.


All Repeats

Within the editing tool, there is a section called «Repeats.» There are two different reports you can run. The first is «All Repeats.» This report will check your writing for repeated phrases. The report will populate on the left-hand side, and it organizes by the length of the phrase. The longest repeated phrases will show up first.

Of course, sometimes these words need to be repeated. But you’ll be surprised at how often you use certain phrases!


Echoes

The next report in the «Repeats» section is «Echoes.» «Echoes» checks for overused words and phrases within a certain word count limit. For instance, if you use the word «I» too many times in a 150-word paragraph, your writing will sound clunky and redundant.

The «Echoes» check is very customizable. Simply click on «Settings» at the top of the editing tool, then click «App Settings.» This will open a new page. Scroll down to «Repeats Settings.» Here you can set the maximum word count distance for your document. Do you only care about echoing words and phrases within 200 words or do you want to see how many echoes you have in 600 words? That’s up to you.

You can also set the minimum number of words within a phrase that it will check. If you want to check single words, change this number to one. This number is completely customizable. The last box allows you to change the minimum phrase occurrence. If something only repeats twice, you might not care to change it. You can make this number higher or lower depending on your preferences.
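To make the three settings concrete, here is a rough Python sketch of an echo check for single words. It is an illustration only and does not reflect how ProWritingAid implements the report. The function name `find_echoes` and the parameters `max_distance` and `min_occurrences` are hypothetical stand-ins for the word count distance and minimum occurrence settings described above.

```python
import re
from collections import defaultdict


def find_echoes(text, max_distance=150, min_occurrences=3):
    """Return words used at least `min_occurrences` times within `max_distance` words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Record every position at which each word appears.
    positions = defaultdict(list)
    for index, word in enumerate(words):
        positions[word].append(index)
    echoes = set()
    for word, where in positions.items():
        # Check every run of `min_occurrences` consecutive uses of the word.
        for start in range(len(where) - min_occurrences + 1):
            span = where[start + min_occurrences - 1] - where[start]
            if span < max_distance:
                echoes.add(word)
                break
    return sorted(echoes)


scene = "She nodded. He nodded back. Then she nodded again."
print(find_echoes(scene))  # ['nodded']
```

Lowering `max_distance` or raising `min_occurrences` makes the check stricter about what counts as an echo, so fewer words are reported.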

Style Report

The Style Report is one of the most powerful tools in ProWritingAid. It checks things like adverb usage, emotion tells, and readability. It also checks for repeated sentence starts.

When three or more sentences in a row start with the same word, your writing will read as repetitive and mechanical. Although sometimes this can be a strategic literary device called anaphora, more often than not, it’s unintended repetitiveness. When you’ve looked at the same piece of writing many times, it’s easy to miss these repeated sentence starts.
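The "three or more in a row" rule is simple enough to sketch in Python. This is an illustration under assumptions, not the Style Report's actual logic: the function names are invented, and sentences are split naively on `.`, `!`, and `?`, which will misfire on abbreviations.

```python
import re
from itertools import groupby


def first_word(sentence):
    """Return the lowercased first word of a sentence, or '' if it has none."""
    match = re.search(r"[A-Za-z0-9']+", sentence)
    return match.group(0).lower() if match else ""


def repeated_sentence_starts(text, run_length=3):
    """Return (word, sentences) for each run of sentences starting with the same word."""
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    runs = []
    # groupby only groups adjacent items, which matches "in a row".
    for word, group in groupby(sentences, key=first_word):
        group = list(group)
        if word and len(group) >= run_length:
            runs.append((word, group))
    return runs


draft = "She ran. She jumped. She fell. Then she slept."
starts = repeated_sentence_starts(draft)
print(starts)  # [('she', ['She ran.', 'She jumped.', 'She fell.'])]
```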


Summary Report

Finally, the Summary Report can also help with repetitiveness. The Summary Report analyzes many factors of your document, and a few of these analytics deal with repetition. First, it can tell you your most used words. It also tells you your overused words and how many you should aim to cut from your writing.

Lastly, it gives you an overview of your Sticky Sentences. While not strictly repetition, Sticky Sentences contain too many glue words, which are the 200 most common words (besides personal pronouns).


Using too many of these slows your reader down and makes your writing sound robotic and clunky. If you repeatedly use glue words, this could be the report for you!

Final Thoughts

Refine your writing and eliminate all your unintentional repetitiveness with ProWritingAid. Try it now!


Krystal N. Craiker

Krystal N. Craiker is the Writing Pirate, an indie romance author and blog manager at ProWritingAid. She sails the seven internet seas, breaking tropes and bending genres. She has a background in anthropology and education, which brings fresh perspectives to her romance novels. When she’s not daydreaming about her next book or article, you can find her cooking gourmet gluten-free cuisine, laughing at memes, and playing board games. Krystal lives in Dallas, Texas with her husband, child, and basset hound. Check out her website or follow her on Instagram: @krystalncraikerauthor.

[H]ard|Forum


finding repeated phrases in word


Thread starter: silk186
Start date: Mar 12, 2017

#1 silk186 (joined Feb 26, 2008; 1,628 messages)


I’m writing a thesis in Word and I often end up moving bits around. I’m wondering if there is a function or plug-in to find repeated phrases and sentences in a Word document.

#2 DrLobotomy (joined May 19, 2016; 6,736 messages)


Save as a text file and then use grep in Linux.
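If grep is not to hand, the same plain-text export can be checked with a few lines of Python. This is a rough sketch under assumptions: the function name `duplicate_sentences` is made up, and the sentence splitter is naive (it breaks after `.`, `!`, or `?`), so abbreviations and headings may be split oddly.

```python
import re
from collections import Counter


def duplicate_sentences(text):
    """Return {normalized sentence: count} for sentences that occur more than once."""
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Lowercase and drop punctuation so small formatting differences are ignored.
    normalized = (
        " ".join(re.findall(r"[a-z0-9']+", s.lower())) for s in sentences
    )
    counts = Counter(n for n in normalized if n)
    return {sentence: count for sentence, count in counts.items() if count > 1}


sample = ("Results are shown in Table 2. We discuss them below. "
          "Results are shown in Table 2.")
duplicates = duplicate_sentences(sample)
print(duplicates)  # {'results are shown in table 2': 2}
```

For a real document, the text would come from the saved file, for example by reading a hypothetical `thesis.txt` with `open("thesis.txt", encoding="utf-8").read()`.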

#3 muz_j (joined Jul 13, 2014; 227 messages)


1. Open «Advanced Find» from the drop-down menu. The «Find and Replace» window pops up.

2. Type the word you want to search for in the Find What box.

4. Click the «More» button at the bottom of the window to view more options.

5. Place a check mark in front of the «Find whole words only» option.

6. Click the «Reading Highlight» button and then «Highlight All» to find all duplicate words and highlight them.

7. Click «Close» to close the Find And Replace window. The results remain highlighted.


Hello Experts,
1. Do the duplicate sentences have to be adjacent to each other?

No, the duplicate sentences can appear anywhere in the document.

2. «single word appears»: not sure how this pertains to duplicate sentence detection. Please elaborate.

For example, there are multiple tables that have headings such as «Name», «Address»,….
So «Name» is a duplicate single word because it appears in almost all of the tables, and it needs to be highlighted.

3. How big will these documents be?

A document can be 300 to 600 pages long.

4. Is it possible that you need to find multiple sentence copies (3 or more)?

Yes, a sentence can appear two or more times.

5. Once the duplicates have been identified, what do you do next?

Once the duplicates have been identified, highlight all the duplicate sentences.

Thanks a lot,
Shailesh
