Evaluating Tabular Extracts
Can tabular extracts be readily compared to text extracts?
- sentence-selection-based evaluation not applicable
- style / fluency metrics not applicable
- task or query-based evaluation could work
- is length just the number of words in the table?
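
The length question can be made concrete with a small sketch. It assumes a tabular extract is represented as a list of rows, each a list of cell strings. The function name `table_length` and the three candidate measures (words, non-empty cells, rows) are illustrative assumptions, not definitions taken from these notes.

```python
def table_length(table):
    """Return candidate length measures for a table given as a list of rows.

    Hypothetical helper: the notes do not say which measure is the right one.
    """
    # Flatten the table and normalise every cell to a stripped string.
    cells = [str(cell).strip() for row in table for cell in row]
    non_empty = [c for c in cells if c]
    return {
        "words": sum(len(c.split()) for c in non_empty),  # whitespace-delimited tokens
        "cells": len(non_empty),                          # filled cells only
        "rows": len(table),                               # header row included
    }


# Made-up example table, used only to show how the measures diverge.
example = [
    ["Company", "Revenue", "Year"],
    ["Acme Corp", "12.5 million", "1999"],
]
lengths = table_length(example)
print(lengths)  # {'words': 8, 'cells': 6, 'rows': 2}
```

The same table is 8 words, 6 cells, or 2 rows long, so any length-matched comparison against a text extract depends on which unit is chosen.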