README.md (+9 -3)
@@ -3,10 +3,14 @@
 Feed Visualizer is a tool that can cluster RSS/Atom feed items based on semantic similarity and generate an interactive visualization.
 This tool can be used to generate a 'semantic summary' of any website by reading its RSS/Atom feed. The image below shows what a visualization generated by Feed Visualizer looks like. If you like this tool, please consider giving it a ⭐ on GitHub!

-
+


 Interactive Demos:
+
+* Visualization created from [NASA’s RSS Feed](https://www.nasa.gov/rss/dyn/breaking_news.rss) :
+https://ashishware.com/static/nasa_viz.html
+
 * Visualization created from [Martin Fowler's Atom Feed](https://martinfowler.com/feed.atom) :
@@ -49,10 +53,11 @@ Now, we need to create a config file for Feed Visualizer. The config file contai
 "input_directory": "nasa",
 "output_directory": "nasa_output",
 "pretrained_model": "all-mpnet-base-v2",
-"clust_dist_threshold":4,
+"clust_dist_threshold":1,
 "tsne_iter": 8000,
 "text_max_length": 2048,
-"topic_str_min_df": 0.25
+"random_state": 45,
+"topic_str_min_df": 0.20
 }
 ```

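The updated config in this hunk could be generated and sanity-checked with a small script before running the tool. The following is a minimal sketch, not part of Feed Visualizer: `validate_config`, `write_config`, the expected types, and the output file name are hypothetical and chosen for illustration. Only the keys and values are taken from the config shown above.

```python
import json
import tempfile
from pathlib import Path

# Keys and values copied from the updated config in the diff.
CONFIG = {
    "input_directory": "nasa",
    "output_directory": "nasa_output",
    "pretrained_model": "all-mpnet-base-v2",
    "clust_dist_threshold": 1,
    "tsne_iter": 8000,
    "text_max_length": 2048,
    "random_state": 45,
    "topic_str_min_df": 0.20,
}

# Expected types are an assumption inferred from the README's
# setting descriptions ("Integer ...", "A float ...").
EXPECTED_TYPES = {
    "input_directory": str,
    "output_directory": str,
    "pretrained_model": str,
    "clust_dist_threshold": int,
    "tsne_iter": int,
    "text_max_length": int,
    "random_state": int,
    "topic_str_min_df": float,
}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks fine."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{key} should be {expected.__name__}, got {type(value).__name__}"
            )
    min_df = config.get("topic_str_min_df")
    if isinstance(min_df, float) and not 0.0 < min_df <= 1.0:
        problems.append("topic_str_min_df should be a fraction in (0, 1]")
    return problems


def write_config(config, path):
    """Validate the config and write it to `path` as JSON."""
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    path = Path(path)
    path.write_text(json.dumps(config, indent=4))
    return path


if __name__ == "__main__":
    # Hypothetical file name; any path can be used.
    target = Path(tempfile.gettempdir()) / "feed_visualizer_config.json"
    print(write_config(CONFIG, target).read_text())
```

Checking the file up front catches a missing `random_state` or an out-of-range `topic_str_min_df` before a long t-SNE run starts.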
@@ -79,6 +84,7 @@ Here is some information on what each config setting does:
 "clust_dist_threshold": "Integer representing the maximum radius of a cluster. There is no correct value here. Experiment!",
 "tsne_iter": "Integer representing the number of iterations for TSNE (higher is better)",
 "text_max_length": "Integer representing the number of characters to read from content/description for semantic encoding.",
+"random_state": "An integer that serves as the random seed when generating the visualization. Use the same random_state for reproducible results on the same set of data",
 "topic_str_min_df": "A float. For example, a value of 0.25 means that only phrases present in 25% or more of the items in a cluster are considered as candidate names for the cluster."
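The effect of `topic_str_min_df` can be illustrated with a small document-frequency filter. This is a simplified stand-in written with the standard library only; how Feed Visualizer actually extracts and ranks phrases is not shown in this diff, so `candidate_phrases`, the whitespace tokenization, the two-word phrase limit, and the sample items are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical feed item titles belonging to one cluster.
ITEMS = [
    "mars rover landing",
    "mars rover sends images",
    "mars orbit update",
    "moon mission delayed",
]


def candidate_phrases(items, min_df=0.25, max_words=2):
    """Return {phrase: fraction of items containing it} for phrases that
    appear in at least `min_df` of the items in the cluster."""
    doc_freq = Counter()
    for text in items:
        words = text.lower().split()
        seen = set()
        # Collect every phrase of 1..max_words consecutive words.
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                seen.add(" ".join(words[i:i + n]))
        # A set ensures each phrase is counted once per item.
        doc_freq.update(seen)
    threshold = min_df * len(items)
    return {
        phrase: count / len(items)
        for phrase, count in doc_freq.items()
        if count >= threshold
    }


if __name__ == "__main__":
    # With min_df=0.5, a phrase must appear in at least 2 of the 4 items.
    print(candidate_phrases(ITEMS, min_df=0.5))
```

In this sample, a threshold of 0.5 keeps only "mars", "rover", and "mars rover", while 0.25 keeps every phrase because each one appears in at least one of the four items. Lowering the value, as the config change from 0.25 to 0.20 does, admits more candidate names per cluster.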