Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Keyword matching

 
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Script needs to read excel file which has let's say column A having these set of keywords let' s say 100 such keywords in this column and
after reading these keywords from this column 'A' python script should search them to a particular path in D drive for all folders and subfolders
containing miscellaneous file types(.xml,.txt,.html,.yml,.sh...etc.)and once match is found lets say for first keyword it finds
a match in particular file at specific line number in this file and again it finds the same keyword in another file at some other line number
so here totally it was found 2 times and at different line nos in different files but it may also be the scenario that same keyword in same file
is found more than once on same line no itself or at different line no as well in the same file. So we need finally this statics for this keyword :- 1)
File names where it was found 2) At what lines ( it may be repeated more than one line nos as it may depend on the frequency of that particular keyword
how many times it's occuring in a single file itself ) in these files it was found. 3)  
Total count of that particular keyword where all it was found while searching all the file in the
given disk drive for all the files,folders,subfolders in it.
also after reading this excel file it should write the found these details :-1)Keyword matching file names 2) All the line nos where all
it was found with their file names as well 3) Total count for each keyword which were found different no. of times while searching in this
D drive( which has different types of files in it) for all the files,folders,subfolders in it.

Tried below code but it's not producing desired output:-

yourpath = 'D:\\Tool devlopment\\tool\\Folder
import glob
from collections
import Counter
import re
import errno
import os
import time
from datetime
import datetime
import datetime
import warnings
warnings.filterwarnings("ignore")
timestr = time.strftime("%Y%m%d-%H%M%S")

file2 = open("D:\\Tool devlopment\\tool\\Module\\excel.txt", 'r')
text_string1 = file2.read().lower()
word2 = text_string1.split()
sast_txt = "D:\\Tool devlopment\\tool\\Module\\SAST_ScanReport" + timestr + ".txt"
FO = open(sast_txt, 'w')
cnt = Counter()
for root, dirs, files in os.walk(yourpath, topdown = False):
   for name in files:
   path = os.path.join(root, name)
files = glob.glob(path)
for name in files:
   try:
   with open(name, encoding = "utf8", errors = 'ignore') as f:
   text_string1 = f.read().lower()
word1 = re.split("[^a-zA-Z]*", text_string1)
for i in range(0, len(word2)):
   if (word2[i] in set(word1)):
       cnt[word2[i]] += 1
str2 = "Vuernabilitiy found at line: " + str(i + 1) + ' in file ' + os.path.join(root, name) + "\n"
FO.write(str2)
str2 = ''
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
also tried below script as well but it's also not working:-

import os
import glob

def search_words(keyword, target_dir):
   files = glob.glob(target_dir + '/**', recursive = True)
python_files = []
results = []
line_no = []# Isolate target files from folders and everything
else
   for f in files:
   if f.endswith('.py'):
   python_files.append(f)

for pyf in python_files:
   with open(pyf, 'rb') as f:
   lines = f.readlines()
for i, line in enumerate(lines):
   line = str(line)
if line.find(keyword) > -1:
   line_no.append(i)

results.append({
   'keyword': keyword,
   'lines': line_no,
   'target_file': pyf,
   'total_found': len(line_no)
})
return results
 
Saloon Keeper
Posts: 7585
176
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Please stop posting the same content again and again. Use the preview functionality to check whether you've used the code tags correctly.

And take ItDoesntWorkIsUseless to heart: Provide meaningful descriptions of what a piece of code is supposed to do, and what it actually does.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
As stated earlier in this post script should be able to search all type of files like .txt,.yml,.xml,.html,.sh ...etc. in all the folders and subfolders for that given path in system's D drive and then finally be able to append the count column which has total counts for all the corresponding found keywords from this excel (Here let's say Column 'A' in this excel has all these keywords in it) so against column A it should list count and in next column it should list files where it found these key words and in next column it should list file names with the line nos for the found keyword.

Thanks
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
python experts could you please help on the same?
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
any updates by python experts please?
 
Tim Moores
Saloon Keeper
Posts: 7585
176
  • Likes 2
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Please stop nagging so often, it's not likely to make people want to help. Everybody here is a volunteer, and folks are not standing by to help you. If the issue is that urgent and important to you, you might consider hiring a consultant.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
ya, but any quick assistance would be much appreciated on this.

Thanks
 
Marshal
Posts: 28193
95
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator

Jim ken wrote:python experts could you please help on the same?



Recall that Tim said "Provide meaningful descriptions of what a piece of code is supposed to do, and what it actually does." Even Python experts can't help until you do that.

In case it isn't obvious, just asking for assistance does not constitute a description of what your code is supposed to do, nor what it actually does.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
code is supposed to give result in a set format [keyword1,found in file name(at this lines),found in file name(at this line),(total found count of keyword1)] so for all the keywords which are listed in excel file column 'A' it should give these results that too in a graph format in html file (containing count etc. for keywords) but it's failing whle execution time itself giving 'invalid code identifier'.

Thanks
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Any updates please?
 
Tim Moores
Saloon Keeper
Posts: 7585
176
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
You have been told before not to nag. Next time you do this, this topic will be closed. We may also close your account if that seems an easier path for the moderators.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
if no help can be offered then why this forum have been opened and code is self explanatory any python expert while reading it will be able to tell where is the mistake if this forum is not for help for users then what the benifit of this forum,this forum itself be closed.
 
Tim Moores
Saloon Keeper
Posts: 7585
176
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
You have been told before how you might phrase a problem description so as to make it more likely that people can help you, but you have chosen not to do that. As you have also been told before, people here are volunteers, and not standing by to help anybody in particular. If people choose not to answer any particular question -possibly because it is phrased in a way so that it is hard to answer- then that is their valid choice.

On the other hand, these forums are heavily slanted towards Java - I'm sure there are much better forums for Python questions elsewhere on the net.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
tried this but it's showig incorrect line no.
#importing required packages

import glob
from collections import Counter
import re
import xlwt
from xlwt import Workbook
import xlsxwriter
import xlrd
import errno
import time
from datetime import datetime
import datetime
import os
import os.path
import warnings
from xlutils.copy import copy
import openpyxl

# opening excel file

from xlrd import open_workbook

warnings.filterwarnings("ignore")
timestr = time.strftime("%Y%m%d-%H%M%S")

# path where all the folders and sub folders need to be searched.

yourpath = "D:\\mainfolder\\subfolders"


# location where excel containing all the keywords which are to be searched in above path.
loc = ("D:\\sample.xlsx")

cnt = Counter()
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
rows = sheet.nrows

excel_word=[]
# loop to pick up all the keywords from the column of excel one after another.

for i in range(1,rows):
   excel_word.append(sheet.cell_value(i,1))

# report to be generated in this location.
report_txt="D:\\mainfolder\\report"+timestr+".txt"

# report is opened in write mode.

FO = open(report_txt, 'w')

# structure layout of text file where records will be written.

str3="|"+"Pattern"+" "*(20-len("Pattern"))+"|"+"Vuernabilitiy  in file"+" "*(200-len("Vuernabilitiy  in file"))+"|"+"Line No"+" "*(10-len("Line No"))+"|" +"\n"
FO.write(str3)
FO.write("--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------"+"\n")

# main logic

for root, dirs, files in os.walk(yourpath, topdown=False):
   
   line=0
   for name in files:
       
       path=os.path.join(root, name)
       files = glob.glob(path)
       
       for name in files:
               
               
               try:
                   with open(name,encoding="utf8",errors='ignore') as f:
                       
                       text_string1 = f.read()
                       for i in range(0,len(excel_word)):
                           str2=''
                           for num, line in enumerate(name, 1):
                               if excel_word[i] in text_string1:
                                   cnt[excel_word[i]] += 1
                           
                                   str2="|"+excel_word[i]+" "*(20-len(excel_word[i]))+"|"+os.path.join(root, name)+" "*(200-len(os.path.join(root, name)))+"|"+str(num)+" "*(10-len(str(num)))+"|" +"\n"
                               
                           else:
                               cnt[excel_word[i]]+=0
                           FO.write(str2)
                           
                           
                                         
               except IOError as exc:
                   if exc.errno != errno.EISDIR:
                       raise

FO.close()
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
experts, could you please advise now.
 
Sheriff
Posts: 22783
131
Eclipse IDE Spring VI Editor Chrome Java Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Please show some patience, and ease up. We're all volunteers, even staff members, and not everybody has time to come to the forum every hour.
 
Jim ken
Ranch Hand
Posts: 45
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
Ok,hopefully with in next 24 hour some expert will resolve this issue on this forum.

Thanks
 
Tim Moores
Saloon Keeper
Posts: 7585
176
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Report post to moderator
You have been asked repeatedly not to nag, yet you persist in it. I'm closing this topic.
 
Hey, I'm supposed to be the guide! Wait up! No fair! You have the tiny ad!
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
    Bookmark Topic Watch Topic
  • New Topic