My Log About Machine Learning

Hello everyone! I want to tell you how I decided on data mining and how I am progressing.

First of all, the department I studied in covers both data mining and software. While I was at school, I always wondered whether I wanted to be a software developer or a data miner, but I had no time to decide. When the pandemic happened, I had time to think: what do I want?

I decided to do an internship at a software company to see the environment I wanted to work in. I learned SQL and languages that I did not know before, but I realized that I did not want to develop desktop or Android applications.

I went to a bigger city and started an internship in the R&D department of a cyber security company.

I was very confused and realized that meeting with my mentor would help me. He told me that I had seemed very confused when we first met. We decided to draw up a roadmap for me, and later we kept talking about it at the company where I was doing my internship.

First of all, if I was going to progress on the software side, I had to decide which programming language to choose.

I decided to work with Python. When I looked back at the education I received at school, I realized that I enjoyed mathematics, and then I received a machine learning assignment from the company I was at.

In our talks, my mentor said that I was on the right track and that I seemed more determined. He said that I should record what I did while drawing up my roadmap, decide what to do next, and write that down as well.

I would like to tell you what I do while working on machine learning and which parts are difficult.

I first researched what machine learning is and how it is done. When I first found out, I told myself: clean the data and feed it to an algorithm. But whenever I completed one stage, a different stage emerged. Every time I thought I was approaching the result, I noticed something missing.

It was a very difficult process to make sense of the data.

First I thought about what format I wanted the data in. I worked backward from the result to the beginning because I wanted to write the algorithm according to the result.

There were too many missing values and commas in my data, and the data had to stay in rows.

How would I proceed without losing the row layout?

I had never done a file I/O operation before.

We can read the data row by row using readlines(). Yes, I thought I had come a long way.
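As a minimal sketch of reading rows with readlines(): the sample rows and the temporary file below are made up for illustration, since the real data file is not shown here.

```python
import os
import tempfile

# Hypothetical sample rows; the real file is not shown in this post.
sample = 'id=1,level=""traffic"",\nid=2,\n'

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# readlines() returns a list with one string per row, newline included.
with open(path) as f:
    lines = f.readlines()

os.remove(path)

for row in lines:
    print(row.rstrip("\n"))
```

Because every row stays a separate item in the list, the row layout is not lost.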

But I got scared again when I saw that the next term waiting for me was 'search'. Until then I had only searched data with Ctrl+F in Excel. A friend from my workplace said that there is a structure called regex and that it would be useful to look into it. When I searched, I found a function called findall, but it didn't work, because readlines had actually turned my data into a list.
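A small sketch of why findall fails on the result of readlines, and one way around it. The rows are hypothetical, and applying findall to each row separately is only one possible fix, not necessarily the one I ended up with.

```python
import re

# Hypothetical rows, shaped like what readlines() returns.
lines = ['id=1,level=""traffic"",\n', 'id=2,\n']

# Passing the whole list fails: re functions expect a string, not a list.
try:
    re.findall(r'level=\W+\w+', lines)
    failed = False
except TypeError:
    failed = True

# Searching each row separately works and keeps the row layout.
matches_per_row = [re.findall(r'level=\W+\w+', row) for row in lines]

print(failed)
print(matches_per_row)
```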

I realized that what I had found was wrong, so I researched again.

On the bright side, I learned regex while researching.

This time I encountered a different error. It was working the way I wanted, but I couldn't print the result. This was because I had to group it when outputting. This sounded ridiculous, but I still did it, and I got what I wanted.

'level=\W+\w+',i)

if level_search:
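The fragment above is cut off, so here is only a sketch of how the search and the grouping might fit together. It assumes the call was re.search (the function name is not visible in the fragment) and uses a hypothetical row.

```python
import re

# Hypothetical row; the pattern is the one from the fragment above.
i = 'id=1,level=""traffic"",\n'

# Assumption: re.search, which returns a match object or None.
level_search = re.search(r'level=\W+\w+', i)

if level_search:
    # Printing level_search directly shows the match object;
    # group() returns the matched text itself.
    level = level_search.group()
    print(level)
```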

I wrote four lines of code, but a lot of time was gone. The solution was hard to find, yet it looks so small. Anyway, back to the topic. I decided I had to change what I found and print it in a format. At first I thought of changing the values and then putting them into the format, but that didn't work. I tried hard, but it never worked that way.

I'm here again after a long time. In order to organize the data, I first need to clean up the pollution in it, because the data I have is in 'level=""traffic' format.

I want to search and get only the right side of the (=) in the value I find. And I don't want commas. I will use the split function too.

splitted_level = level.split('""')

I split it at the double quotes. I did two operations in one line. I feel perfect. I can now pick the part I want by index.
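A quick sketch of what the split does, using a hypothetical matched value in the 'level=""traffic' format described above:

```python
# Hypothetical matched value in the format described above.
level = 'level=""traffic'

# Splitting on the two double quotes removes them and separates
# the key from the value in a single call.
splitted_level = level.split('""')

print(splitted_level)     # ['level=', 'traffic']
print(splitted_level[1])  # traffic
```

Index [0] holds the left side with the equals sign and index [1] holds the value.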

A method I used in an earlier project came to mind. Back then I saved the data I would use as an array, so I can store the data I split as an array too. I defined an empty array, first_format=[], and I can add my data there.

My friend said I could use append. I learned this at school. For the first time, I am using a function I learned in school.

first_format.append(splitted_level[1]+',') I used index [1] here because I had split the data. I also learned that commas act as separators when I use CSV. I added a comma after each value, and I succeeded.

So what would I write in the places where the regex found no value?

I remembered using if.

else: first_format.append('1998,')

I don't know how this came to my mind. I feel my algorithmic thinking is improving now. All that's left is to rewrite this for every field I want and wrap it in 'for i in data:'.
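Putting the pieces together, the loop might look roughly like this. It is a sketch for a single field only; the rows are hypothetical, and re.search is an assumption because the original call is cut off.

```python
import re

# Hypothetical rows; the second one has no level field.
data = ['id=1,level=""traffic"",\n', 'id=2,\n']

first_format = []

for i in data:
    # Assumption: re.search with the pattern shown earlier.
    level_search = re.search(r'level=\W+\w+', i)
    if level_search:
        # Keep only the right side of the value and add the CSV comma.
        splitted_level = level_search.group().split('""')
        first_format.append(splitted_level[1] + ',')
    else:
        # Placeholder for rows where nothing was found.
        first_format.append('1998,')

print(first_format)
```

Every row adds exactly one item to first_format, so rows without a match still keep their place.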

Now my format is ready. I can now replace the data in it with the values I want.

After long research I learned that English is really necessary, because I could not find what I was looking for. replace cannot be used on a list. Other than that, what I found required too many for loops. I rested my brain a little.

After long research I found a replacement algorithm that meets my needs without much confusion. I found it by searching for "find and replace multiple words in a file python". I added the things I wanted to change to it, and it worked.
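The exact code I found is not shown here, so the following is only a sketch of one common way to replace several words at once: a dictionary of old and new values applied with str.replace. The mapping and the values are hypothetical.

```python
# Hypothetical mapping; the real words and their replacements are not given here.
replacements = {'traffic': '1', 'utm': '2'}

# Hypothetical formatted values from the previous step.
first_format = ['traffic,', 'utm,', '1998,']

replaced = []
for value in first_format:
    # replace() works on strings, so it is applied to each item in turn.
    for old, new in replacements.items():
        value = value.replace(old, new)
    replaced.append(value)

print(replaced)
```

The same loop works over lines read from a file. One thing to watch for: the replacements run one after another, so a new value that contains another old word would be replaced again.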

I have now parsed my data. It seemed like it would never end. I have reached the part I actually have to do.

I think my road is getting shorter. I am starting my research now.

How do I use a machine learning algorithm?
