![]() |
|
|
|||||||
![]() |
|
|
LinkBack | Thread Tools | Search this Thread | Display Modes |
|
|||
|
Finding Similar entries in one column in another
Hi,
I am relatively new to using Excel with vast amounts of data and was after any suggestions which could help me complete a problem that I have.. I have two columns with descriptions in (approximately 10 words). I need to find descriptions from one column (Sheet 1,column C) which are similar to the other column (Sheet 2,Column A).For example,if I could display the number of words which are similar to both columns I could then find the ones with the largest number of similar words. For example, the two columns could look like... Sheet1,column C.... Description 300g Fish fillets Sheet2,column A Description Fish fingers 300g This would then give the result that there were 2 similar words. Any help would be much appreciated as to how to carry this out... |
|
|||
|
Re: Finding Similar entries in one column in another
How about this page?
http://www.dailydoseofexcel.com/arch...zy-text-match/ HTH, JP On Jul 24, 7:18*am, Duguid1 <Dugu...@discussions.microsoft.com> wrote: > Hi, > I am relatively new to using Excel with vast amounts of data and was after > any suggestions which could help me complete a problem that I have.. > > I have two columns with descriptions in (approximately 10 words). I need to > find descriptions from one column (Sheet 1,column C) which are similar tothe > other column (Sheet 2,Column A).For example,if I could display the *number of > words which are similar to both columns I could then find the ones with the > largest number of similar words. > > For example, the two columns could look like... > > Sheet1,column C.... > Description > 300g Fish fillets > > Sheet2,column A > Description > Fish fingers 300g > > This would then give the result that there were 2 similar words. > > Any help would be much appreciated as to how to carry this out... |
|
|||
|
Re: Finding Similar entries in one column in another
On Jul 24, 7:18*am, Duguid1 <Dugu...@discussions.microsoft.com> wrote:
> Hi, > I am relatively new to using Excel with vast amounts of data and was after > any suggestions which could help me complete a problem that I have.. > > I have two columns with descriptions in (approximately 10 words). I need to > find descriptions fromonecolumn(Sheet 1,columnC) which aresimilarto the > othercolumn(Sheet 2,ColumnA).For example,if I could display the *numberof > words which aresimilarto both columns I could then find the ones with the > largest number ofsimilarwords. > > For example, the two columns could look like... > > Sheet1,columnC.... > Description > 300g Fish fillets > > Sheet2,columnA > Description > Fish fingers 300g > > This would then give the result that there were 2similarwords. > > Any help would be much appreciated as to how to carry this out... Solution: Tokenize the word lists [using custom VBA -- see footnote -- or a series of FIND functions] so you have one word per column. Now, each row in list one needs a search result for each row in list 2. This will return the hit count of words in List1 found in each row of List2 (array formula) {=SUM(COUNTIF(OFFSET(List2,J$1-1,0),($B2:$G2)))} List2 is a named range that points to /only the first row/ of the tokenized list of words from the second list. The OFFSET piece increments which row is actually scanned. J$1 etc. contains the row offset. 1 means scan the first row of List2. 2 means scan the second row if List2, etc. $B2:$G2 is the tokenized word list for List1. Don't forget it's an array formula! (Ctrl+Shift+Enter). Footnote : I can post the VBA code module and a workbook that demonstrates how the formulas work on my website later today if you like. The VBA code is not strictly necessary, but it makes tokenizing the words much easier. |
|
|||
|
Re: Finding Similar entries in one column in another
Recursive Error wrote:
> On Jul 24, 7:18 am, Duguid1 <Dugu...@discussions.microsoft.com> wrote: >> Hi, >> I am relatively new to using Excel with vast amounts of data and was after >> any suggestions which could help me complete a problem that I have.. >> >> I have two columns with descriptions in (approximately 10 words). I need to >> find descriptions fromonecolumn(Sheet 1,columnC) which aresimilarto the >> othercolumn(Sheet 2,ColumnA).For example,if I could display the number of >> words which aresimilarto both columns I could then find the ones with the >> largest number ofsimilarwords. >> >> For example, the two columns could look like... >> >> Sheet1,columnC.... >> Description >> 300g Fish fillets >> >> Sheet2,columnA >> Description >> Fish fingers 300g >> >> This would then give the result that there were 2similarwords. >> >> Any help would be much appreciated as to how to carry this out... > > Solution: > > Tokenize the word lists [using custom VBA -- see footnote -- or a > series of FIND functions] so you have one word per column. > > Now, each row in list one needs a search result for each row in list > 2. > > This will return the hit count of words in List1 found in each row of > List2 (array formula) > > {=SUM(COUNTIF(OFFSET(List2,J$1-1,0),($B2:$G2)))} > > List2 is a named range that points to /only the first row/ of the > tokenized list of words from the second list. > > The OFFSET piece increments which row is actually scanned. J$1 etc. > contains the row offset. 1 means scan the first row of List2. 2 means > scan the second row if List2, etc. > > $B2:$G2 is the tokenized word list for List1. > > Don't forget it's an array formula! (Ctrl+Shift+Enter). > > Footnote : I can post the VBA code module and a workbook that > demonstrates how the formulas work on my website later today if you > like. The VBA code is not strictly necessary, but it makes tokenizing > the words much easier. I have posted a VBA-free version of a workbook that demonstrates the solution. The code module that contains the handy tokenizing function (called "GetString") is offered separately for your inspection before use. http://vfdrake.home.comcast.net/file...word_count.xls http://vfdrake.home.comcast.net/file...t_handlers.bas [Yes, I am Recursive Error too...] Enjoy. |
|
|||
|
Re: Finding Similar entries in one column in another
smartin wrote:
> Recursive Error wrote: >> On Jul 24, 7:18 am, Duguid1 <Dugu...@discussions.microsoft.com> wrote: >>> Hi, >>> I am relatively new to using Excel with vast amounts of data and was >>> after >>> any suggestions which could help me complete a problem that I have.. >>> >>> I have two columns with descriptions in (approximately 10 words). I >>> need to >>> find descriptions fromonecolumn(Sheet 1,columnC) which aresimilarto the >>> othercolumn(Sheet 2,ColumnA).For example,if I could display the >>> number of >>> words which aresimilarto both columns I could then find the ones with >>> the >>> largest number ofsimilarwords. >>> >>> For example, the two columns could look like... >>> >>> Sheet1,columnC.... >>> Description >>> 300g Fish fillets >>> >>> Sheet2,columnA >>> Description >>> Fish fingers 300g >>> >>> This would then give the result that there were 2similarwords. >>> >>> Any help would be much appreciated as to how to carry this out... >> >> Solution: >> >> Tokenize the word lists [using custom VBA -- see footnote -- or a >> series of FIND functions] so you have one word per column. >> >> Now, each row in list one needs a search result for each row in list >> 2. >> >> This will return the hit count of words in List1 found in each row of >> List2 (array formula) >> >> {=SUM(COUNTIF(OFFSET(List2,J$1-1,0),($B2:$G2)))} >> >> List2 is a named range that points to /only the first row/ of the >> tokenized list of words from the second list. >> >> The OFFSET piece increments which row is actually scanned. J$1 etc. >> contains the row offset. 1 means scan the first row of List2. 2 means >> scan the second row if List2, etc. >> >> $B2:$G2 is the tokenized word list for List1. >> >> Don't forget it's an array formula! (Ctrl+Shift+Enter). >> >> Footnote : I can post the VBA code module and a workbook that >> demonstrates how the formulas work on my website later today if you >> like. The VBA code is not strictly necessary, but it makes tokenizing >> the words much easier. > > I have posted a VBA-free version of a workbook that demonstrates the > solution. The code module that contains the handy tokenizing function > (called "GetString") is offered separately for your inspection before use. > > http://vfdrake.home.comcast.net/file...word_count.xls > http://vfdrake.home.comcast.net/file...t_handlers.bas > > [Yes, I am Recursive Error too...] > > Enjoy. You will find the solution I gave returns undesired results when the lists have varying word counts--which is likely. This is because any blank cells in the tokenized List1 will match blank cells in the tokenized List2. The following formula will return only word matches, excluding the blank cells: {=SUM(COUNTIF(OFFSET(List2,J$1-1,0),($B2:$G2)))-SUM(COUNTIF(OFFSET(List2,J$1-1,0),""))*SUM(COUNTIF($B2:$G2,""))} |
![]() |
|
| Thread Tools | Search this Thread |
| Display Modes | |
|
|