Tf-idf

Closed
informaticienne - Apr 6, 2016 at 06:53 PM
 Blocked Profile - Apr 11, 2016 at 05:10 PM
Hello,
I want to implement TF-IDF: I want to calculate the TF-IDF of a text file,
but I don't know what I should modify in this code.



package ibtissem;

import java.util.Arrays;
import java.util.List;

public class exer {

    // NOTE: the tfidfCalcultor class is not defined anywhere in this post and
    // the field is never used, so it is commented out to let the class compile.
    // private static tfidfCalcultor calculator;

    /**
     * @param doc  list of strings
     * @param term String represents a term
     * @return term frequency of term in document
     */
    public static double tf(List<String> doc, String term) {
        double result = 0;
        for (String word : doc) {
            if (term.equalsIgnoreCase(word))
                result++;
        }
        return result / doc.size();
    }

    /**
     * @param docs list of list of strings represents the dataset
     * @param term String represents a term
     * @return the inverse document frequency of term in documents
     */
    public static double idf(List<List<String>> docs, String term) {
        double n = 0; // number of documents that contain the term
        for (List<String> doc : docs) {
            for (String word : doc) {
                if (term.equalsIgnoreCase(word)) {
                    n++;
                    break;
                }
            }
        }
        // Guard: a term found in no document would otherwise divide by zero.
        if (n == 0) {
            return 0;
        }
        return Math.log(docs.size() / n);
    }

    /**
     * @param doc  a text document
     * @param docs all documents
     * @param term term
     * @return the TF-IDF of term
     */
    public static double tfIdf(List<String> doc, List<List<String>> docs, String term) {
        return tf(doc, term) * idf(docs, term);
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("Lorem", "ipsum", "dolor", "ipsum", "sit", "ipsum");
        List<String> doc2 = Arrays.asList("Vituperata", "incorrupte", "at", "ipsum", "pro", "quo");
        List<String> doc3 = Arrays.asList("Has", "persius", "disputationi", "id", "simul", "lorem");
        List<List<String>> documents = Arrays.asList(doc1, doc2, doc3);
        // calculator = new tfidfCalcultor(); // see the note on the field above
        double tfidf = tfIdf(doc1, documents, "lorem");
        System.out.println("TF-IDF (lorem) = " + tfidf);
    }
}


3 replies

Blocked Profile
Apr 6, 2016 at 07:11 PM
So, what language are you compiling this in? Where is it running?
informaticienne
Apr 7, 2016 at 05:55 AM
The language is Java :o :o
Blocked Profile
Apr 7, 2016 at 04:24 PM
So, what are you expecting from the result?
informaticienne
Apr 7, 2016 at 05:01 PM
The TF-IDF of a text file.
Blocked Profile
Apr 8, 2016 at 04:18 PM
So what are you getting as a result? A better question would be: where is the file located that you are calculating? I do not see a file referenced in the code (no relative, absolute or UNC path!). Where did you cut this code from?

Also, do you have the package you are referencing unpacked and accessible?

I have said it once, I will say it again. IT!
informaticienne
Apr 9, 2016 at 05:49 AM
This is just an example for TF-IDF, but I don't know how to use a text file to calculate the TF-IDF of that text file.
Any help please, thank you.
Blocked Profile
Apr 11, 2016 at 05:10 PM
Well, given that you cannot verify whether the package is available to you, I cannot help any further. I could sit here all day and tell you how to implement your code, but if you are referencing packages that you don't have access to, it isn't worth it. From all of the other EXAMPLES from the internet, it looks like you need two other packages! How long have you been using programming languages?
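
For the question that was left open (how to feed a text file into the tf/idf methods from the first post), here is a minimal sketch, not a definitive implementation. It assumes that each file is one document, that the files are UTF-8, and that words are separated by anything that is not a letter or digit. The class name TfIdfFromFile and the helpers tokenize and readDocument are made up for this illustration; tf, idf and tfIdf follow the formulas from the question, with guards added for an empty document and for a term that appears in no document. It uses only the standard library, so no extra package is needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TfIdfFromFile {

    // Hypothetical helper: split raw text into lowercase words.
    // Anything that is not a letter or a digit is treated as a separator.
    public static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Hypothetical helper: read one file (assumed UTF-8) as one document.
    public static List<String> readDocument(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return tokenize(text);
    }

    // Term frequency: occurrences of the term divided by the document length.
    public static double tf(List<String> doc, String term) {
        if (doc.isEmpty()) {
            return 0; // avoid 0 / 0 for an empty file
        }
        double count = 0;
        for (String word : doc) {
            if (term.equalsIgnoreCase(word)) {
                count++;
            }
        }
        return count / doc.size();
    }

    // Inverse document frequency: log(number of documents / documents containing the term).
    public static double idf(List<List<String>> docs, String term) {
        double n = 0;
        for (List<String> doc : docs) {
            for (String word : doc) {
                if (term.equalsIgnoreCase(word)) {
                    n++;
                    break;
                }
            }
        }
        if (n == 0) {
            return 0; // term appears nowhere: avoid dividing by zero
        }
        return Math.log(docs.size() / n);
    }

    public static double tfIdf(List<String> doc, List<List<String>> docs, String term) {
        return tf(doc, term) * idf(docs, term);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java TfIdfFromFile <term> <file1> [file2 ...]");
            return;
        }
        String term = args[0];
        List<List<String>> documents = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            documents.add(readDocument(Paths.get(args[i])));
        }
        // Print the TF-IDF of the term once per input file.
        for (int i = 0; i < documents.size(); i++) {
            System.out.println("TF-IDF (" + term + ") in " + args[i + 1]
                    + " = " + tfIdf(documents.get(i), documents, term));
        }
    }
}
```

Run it with the term first and the files after it, for example: java TfIdfFromFile lorem a.txt b.txt c.txt (the file names are placeholders). Note that with a single file the idf is log(1/1) = 0, so the TF-IDF is always 0; the measure only becomes meaningful when several documents are compared.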