Change github download url list filename #23

Open: @MatheMatrix wants to merge 1 commit into master from change-github-url-list-name

@MatheMatrix MatheMatrix commented Nov 3, 2021

The filenames are far from the actual number of lines in each URL list:

original name |  lines    | new name
----------------------------------------
c100.txt      |  6306     | c1000.txt
cpp100.txt    |  13915    | cpp10000.txt
java10.txt    |  73783    | java10000.txt
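
For anyone who wants to re-check the line counts in the table, here is a minimal sketch. It assumes each list file holds one repository URL per line and that blank lines should not be counted; the helper name `count_urls` is hypothetical and not part of this repository.

```python
import sys


def count_urls(path):
    """Count the non-blank lines in a URL list file (one URL per line assumed)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Pass the list files to check on the command line,
    # e.g. c100.txt cpp100.txt java10.txt
    for name in sys.argv[1:]:
        print(f"{name}: {count_urls(name)} lines")
```

Run it from the directory that holds the list files and compare the printed counts with the "lines" column above.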

@MatheMatrix MatheMatrix force-pushed the change-github-url-list-name branch 2 times, most recently from 4f2a380 to f3a55e9 Compare November 3, 2021 05:48
@MatheMatrix MatheMatrix force-pushed the change-github-url-list-name branch from f3a55e9 to e3b8d51 Compare November 3, 2021 05:49
@jgottschlich jgottschlich requested a review from nhasabni November 3, 2021 16:39
@jgottschlich
Contributor

Seems reasonable to me. Adding @nhasabni for his approval, too.

@jgottschlich
Contributor

Looks reasonable to me; approved. Waiting for @nhasabni to approve as well.

Thank you @MatheMatrix for your correction and contribution!

Justin

@nhasabni
Contributor

nhasabni commented Nov 3, 2021

Hi @MatheMatrix,

Thanks for the PR. The suffix 100 in the name c100.txt is the number of GitHub stars used as the threshold when selecting repositories for mining. So, IMO, the name c1000.txt would be misleading. That being said, I think the description in the README could be corrected. Would you be interested in correcting it? Thanks.


Steps below show how to download Top-100 GitHub repos for C language
(`c100.txt`) and generate training data. `training_repo_dir` is a directory
Steps below show how to download Top-1000 GitHub repos for C language
Contributor

@nhasabni nhasabni Nov 3, 2021

I think this would be a better description: "Steps below show how to download GitHub repositories having at least 100 GitHub stars for C language (`c100.txt`)". What do you think?

And in that case, the file name change is not needed.

Contributor

Oh, right. Thanks for clarifying, @nhasabni. That sounds right to me.

3 participants