Hello everyone! I want to tell you how I decided on data mining and how I am progressing.
First of all, my department taught both data mining and software development. While I was at school, I always wondered whether I wanted to be a software developer or a data miner, but I had no time to decide. When the pandemic happened, I finally had time to think: what do I want?
I decided to do an internship at a software company to see the environment I might work in. I learned SQL and programming languages that were new to me, but I realized that I did not want to develop desktop or Android applications.
I went to a bigger city and started an internship in the R&D department of a cyber security company.
I was very confused and realized that meeting with my mentor would help me. He later told me that I had seemed very confused when we first met. We decided to draw up a roadmap for me, and later we talked about it at the company where I did my internship.
First of all, if I was going to progress on the software side, I had to decide which programming language to choose.
I decided to work with Python. Looking back at the education I received at school, I realized that I enjoyed mathematics, and the company I was at gave me a machine learning assignment.
In our talks, my mentor said that I was on the right track and that I seemed more determined. He said that while drawing my roadmap I should record what I did, decide what to do next, and write that down as well.
I would like to tell you what I do while doing machine learning and which parts of it are difficult.
I first researched what machine learning is and how it is done. At first I thought it was just a matter of cleaning the data and feeding it to an algorithm. But every time I finished one stage, a different stage emerged. Whenever I thought I was approaching the result, I noticed something missing.
Making sense of the data was a very difficult process.
First I thought about what format I wanted the data in. I worked backwards from the result, because I wanted to write the algorithm to fit the result.
There were too many missing values and commas in my data, and the values had to stay in rows.
How would I proceed without losing the row layout?
I had never done a file I/O operation before.
It turns out we can read the data row by row using readlines(). Yes, I thought I had come a long way.
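Here is a minimal sketch of reading a file row by row with readlines(). The sample rows and the temporary file are assumptions made only so the example can run on its own; the real file is not shown in this post.

```python
import os
import tempfile

# Hypothetical sample rows; the real file and its contents are not shown in this post.
sample_rows = 'level=""traffic,src=10.0.0.1\nlevel=""alert,src=10.0.0.2\n'

# Write the sample to a temporary file so the example can run anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False, encoding="utf-8") as tmp:
    tmp.write(sample_rows)
    path = tmp.name

# readlines() returns a list with one string per row.
# Each string keeps its trailing newline character.
with open(path, encoding="utf-8") as f:
    lines = f.readlines()

os.remove(path)  # clean up the temporary file

for row_number, line in enumerate(lines, start=1):
    print(row_number, line.rstrip("\n"))
```

Because the result is a list, the row layout is preserved: each element is one row of the file.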
But I got scared again when I saw that the next term waiting for me was 'search'. Until then I had only searched data with Ctrl+F in Excel. A friend from my workplace said that there is something called regex and that it would be useful to look into it. When I searched, I found a function called findall, but it didn't work, because readlines() had actually given me a list, not a string.
I realized I had found the wrong thing, so I researched again.
On the plus side, I learned regex while researching.
I found the search function.
This time I ran into a different problem. It was matching the way I wanted, but I couldn't print the matched text. That was because I had to call group() on the result of this function. It sounded ridiculous, but I did it anyway, and I got what I wanted.
if level_search: level=level_search.group()
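The difference between the two attempts can be sketched like this. The sample rows and the regex pattern are assumptions; the post does not show the pattern that was actually used.

```python
import re

# Hypothetical rows, shaped like the list that readlines() returns.
lines = ['level=""traffic,src=10.0.0.1\n', 'src=10.0.0.2\n']

# Assumed pattern: "level=" followed by everything up to the next comma.
pattern = re.compile(r'level=[^,\n]*')

# Regex functions expect a string, so passing the whole list raises TypeError.
passing_a_list_fails = False
try:
    re.findall(pattern, lines)
except TypeError:
    passing_a_list_fails = True

# Searching one row at a time works. search() returns a match object
# (or None when nothing matches), and group() gives the matched text.
levels = []
for line in lines:
    level_search = pattern.search(line)
    if level_search:
        levels.append(level_search.group())

print(levels)
```

Printing the match object itself shows a description of the match rather than the text, which is why group() is needed.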
I thought I had done the hard part. But what would I do with the values I found?
I had written 4 lines of code, but a lot of time was gone. The way was hard to find, and what I found looks so small. Anyway, back to the topic. I decided I had to change the values I found and print them in a format. At first I thought of changing the values and then putting them into the format, but that never worked, however hard I tried.
So I had to format the data first. But how?
I'm here again after a long time. In order to organize the data, I first need to clean up the noise in it, because the data I have is in 'level=""traffic' format.
I want to search and take only the right side of the equals sign (=) in the value I find. And I don't want commas. I will use the split function too.
splitted_level = level.split('""')
I split it at the commas. I did two operations in one line. I feel perfect. I can now reach the pieces by index.
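A small sketch of this split step, assuming the separator is the doubled double quote seen in the 'level=""traffic' sample:

```python
# Sample value in the format described above.
level = 'level=""traffic'

# split() cuts the string at every occurrence of the separator and returns a list.
splitted_level = level.split('""')
print(splitted_level)

# Index 1 is the part to the right of the separator.
value = splitted_level[1]
print(value)
```

The split both separates the name from the value and drops the separator characters, so one call does two jobs.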
But what do I do with these pieces? How do I format them?
A method I had used in an earlier project came to mind: I had saved the data I would use in an array. So I can store the data I split in an array too. I defined an empty array, first_format = [], and I can add my data there.
My friend said I could use append. I had learned this at school. It was the first time I used a function I learned at school.
first_format.append(splitted_level + ',') I used an index on splitted_level here, because the split had given me a list. I also learned that commas act as separators when I use CSV, so I added a comma after each value, and it worked.
Now my data is in the format I want.
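The append step might look like the sketch below. The index 1 is an assumption (the post does not say which index was used), and the sample value is hypothetical.

```python
# Result of the earlier split step, using a hypothetical sample value.
splitted_level = ['level=', 'traffic']

# Start with an empty list and add each cleaned value followed by a comma.
first_format = []
first_format.append(splitted_level[1] + ',')  # index 1 is assumed here

print(first_format)
```

Appending splitted_level itself plus a comma would raise a TypeError, because a list and a string cannot be added together. That is why an index is needed to pick one string out of the list.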
So what would I write in the places where my regex found no value?
I remembered using if.
If I can't find a value, I can add a default one.
I don't know how this came to my mind. I feel my algorithmic thinking improving now. All that's left is to rewrite this for every field I want and wrap it in 'for i in data:'.
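Putting the pieces together, the loop could be sketched as follows. The sample rows, the regex pattern, and the placeholder text "unknown" are all assumptions for illustration.

```python
import re

# Hypothetical rows; the second one has no level field at all.
data = [
    'level=""traffic,src=10.0.0.1\n',
    'src=10.0.0.2\n',
    'level=""alert,src=10.0.0.3\n',
]

MISSING = "unknown"  # assumed placeholder for rows where nothing is found
pattern = re.compile(r'level=[^,\n]*')  # assumed pattern

first_format = []
for i in data:
    level_search = pattern.search(i)
    if level_search:
        # Keep only the part to the right of the doubled quotes.
        level = level_search.group().split('""')[1]
    else:
        # Nothing matched, so fill the gap with the placeholder.
        level = MISSING
    first_format.append(level + ',')

print(first_format)
```

Every row now contributes exactly one entry, so the row layout survives even when a value is missing.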
Now my format is ready. I can now replace the data in it with the values I want.
After a long search I learned that English really is necessary, because I could not find what I was looking for. replace cannot be used on a list, and the other approaches I found needed too many for loops. I rested my brain a little.
What am I looking for?
After more research I found a replacement algorithm that meets my needs without much confusion. I found it by searching for 'find and replace multiple words in a file python'. I added the things I wanted to change to it, and it worked.
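One common way to replace several words at once is a dictionary of old and new values applied with str.replace. This is only a sketch of that general idea, not necessarily the exact algorithm I found; the mapping and the helper name replace_all are hypothetical.

```python
# Hypothetical mapping from the text values to the values wanted in their place.
replacements = {"traffic": "0", "alert": "1"}

def replace_all(text, replacements):
    """Apply every old -> new replacement to one string, one after another."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

# str.replace works on strings, not lists, so it is applied to each row in turn.
rows = ["traffic,", "alert,", "unknown,"]
converted = [replace_all(row, replacements) for row in rows]

print(converted)
```

The replacements run in order, so a new value that contains one of the old words would be replaced again by a later step. With short, distinct values like the ones here that does not happen.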
I have now parsed my data. It seemed like it would never end. I have reached what I actually have to do.
I think my road is getting shorter. I am starting my research now.
How do I use a machine learning algorithm?