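Currently, the program runs down one column of URLs and writes the selected data to the adjacent cell. I can set which column it starts in, but that is all I can do, so right now I am only working on one column. How can I tell it to move on to, say, column 4 (column E) and work top to bottom once it has finished column 0 (A)? And then possibly another one after that, say column J?
I believe my problem lies in the "while (!(cell = sheet.getCell ..." line, but I am not sure what to change without breaking the program.
My code is as follows: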
import java.io.File;
import java.io.IOException;

import jxl.Cell;
import jxl.CellType;
import jxl.Workbook;
import jxl.write.Label;
import jxl.write.WritableSheet;
import jxl.write.WritableWorkbook;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class App {
    private static final int URL_COLUMN = 0;   // Column A
    private static final int PRICE_COLUMN = 1; // Column B

    public static void main(final String[] args) throws Exception {
        // Copy the original workbook into a writable one
        Workbook originalWorkbook = Workbook.getWorkbook(new File("C:/Users/Shadow/Desktop/original.xls"));
        WritableWorkbook workbook = Workbook.createWorkbook(new File("C:/Users/Shadow/Desktop/updated.xls"), originalWorkbook);
        originalWorkbook.close();
        WritableSheet sheet = workbook.getSheet(0);

        int currentRow = 1; // start at the second row (index 1)
        Cell cell;
        // Walk down the URL column until the first empty cell
        while (!(cell = sheet.getCell(URL_COLUMN, currentRow)).getType().equals(CellType.EMPTY)) {
            String url = cell.getContents();
            System.out.println("Checking URL: " + url);
            if (url.contains("scrapingsite1.com")) {
                String Price = ScrapingSite1(url);
                System.out.println("Scraping Site1's Price: " + Price);
                Label cellWithPrice = new Label(PRICE_COLUMN, currentRow, Price);
                sheet.addCell(cellWithPrice);
            }
            currentRow++;
        }
        workbook.write();
        workbook.close();
    }

    private static String ScrapingSite1(String url) throws IOException {
        Document doc = null;
        // Try up to 6 times before giving up
        for (int i = 1; i <= 6; i++) {
            try {
                doc = Jsoup.connect(url).userAgent("Mozilla/5.0").timeout(6000).validateTLSCertificates(false).get();
                break;
            } catch (IOException e) {
                System.out.println("Jsoup issue occurred " + i + " time(s).");
            }
        }
        if (doc == null) {
            return null; // every attempt failed
        } else {
            return doc.select("p.price").text();
        }
    }
}
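To keep the code simple, I assume the price always goes into the next column (+1).
Also, to process several columns, I replaced the single value int URL_COLUMN = 0 with an array of the columns to process: int[] URL_COLUMNS = { 0, 4, 9 }; // Columns A, E, J.
You can then loop over each of the columns {0, 4, 9} and save the data into the column next to it, {1, 5, 10}.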
private static final int[] URL_COLUMNS = { 0, 4, 9 }; // Columns A, E, J

public static void main(final String[] args) throws Exception {
    Workbook originalWorkbook = Workbook.getWorkbook(new File("C:/Users/Shadow/Desktop/original.xls"));
    WritableWorkbook workbook = Workbook.createWorkbook(new File("C:/Users/Shadow/Desktop/updated.xls"), originalWorkbook);
    originalWorkbook.close();
    WritableSheet sheet = workbook.getSheet(0);
    Cell cell;
    // loop over every column
    for (int i = 0; i < URL_COLUMNS.length; i++) {
        int currentRow = 1;
        while (!(cell = sheet.getCell(URL_COLUMNS[i], currentRow)).getType().equals(CellType.EMPTY)) {
            String url = cell.getContents();
            System.out.println("Checking URL: " + url);
            if (url.contains("scrapingsite1.com")) {
                String Price = ScrapingSite1(url);
                System.out.println("Scraping Site1's Price: " + Price);
                // save price into the next column
                Label cellWithPrice = new Label(URL_COLUMNS[i] + 1, currentRow, Price);
                sheet.addCell(cellWithPrice);
            }
            currentRow++;
        }
    }
    workbook.write();
    workbook.close();
}
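
Since the question refers to columns both by index (0, 4) and by letter (A, E, J), it may help to derive the indices from the letters rather than counting by hand. The following is only a minimal sketch: ColumnIndex, fromLetters and allFromLetters are hypothetical names introduced here for illustration, and they are not part of the original answer or of the spreadsheet library.

```java
import java.util.Locale;

/** Hypothetical helper, not part of the original answer or of any library. */
final class ColumnIndex {
    private ColumnIndex() {}

    /** Converts spreadsheet column letters to a zero-based index, e.g. "A" gives 0 and "J" gives 9. */
    static int fromLetters(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("Column letters must not be empty");
        }
        int index = 0;
        // Treat the letters as a base-26 number whose digits run from A=1 to Z=26
        for (char c : letters.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not a column letter: " + c);
            }
            index = index * 26 + (c - 'A' + 1);
        }
        return index - 1; // shift from one-based to zero-based
    }

    /** Converts several column names at once, keeping their order. */
    static int[] allFromLetters(String... letters) {
        int[] result = new int[letters.length];
        for (int i = 0; i < letters.length; i++) {
            result[i] = fromLetters(letters[i]);
        }
        return result;
    }
}
```

With this helper, the array above could be declared as private static final int[] URL_COLUMNS = ColumnIndex.allFromLetters("A", "E", "J"); so the processing order can be read directly from the column letters.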