Python, Selenium, CSV, and UTF-8 (French) characters
I have a CSV file that contains French words, such as "immédiatement". I'm using Python plus Selenium WebDriver to write those words into a text field. Basically, using the required Selenium packages plus csv:
- Start Selenium and go to the correct area.
- Open the CSV file.
- For each row:
  - Get the cell that contains the French word.
  - Write that word in the textarea.
The problem:
"UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 3: invalid start byte"
I've tried:
- Declaring "coding: utf-8" at the top of the file, and leaving it out
- Once I set a variable to the contents of the cell, appending .decode("utf-8")
- Once I set a variable to the contents of the cell, appending .encode("utf-8")
No love.
(I can't set "ignore" or "replace", because I need to type the word out. It doesn't appear to be Selenium itself, because when I put the list directly in the script, the typing goes fine. (I could put it in as a dict in the script, but Jesus, why.))
What am I missing?
[Edit] Sample CSV content:
    3351,payé/effectué,link1
    45922,plannifié,link1
    3693,honoraires par produit,link2
And the generalised code:
    # -*- coding: utf-8 -*-
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    import unittest, time, re, csv

    # Raw string, so that "\t" in the path is not read as a tab.
    csvdoc = r"c:\path\to\sample.csv"

    class Translations(unittest.TestCase):
        def setUp(self):
            self.driver = webdriver.Firefox()
            self.driver.implicitly_wait(30)
            self.base_url = "https://baseurl.com/"
            self.verificationErrors = []
            self.accept_next_alert = True

        def test_translations(self):
            driver = self.driver
            driver.get(self.base_url + "login")
            driver.find_element_by_id("txtusername").clear()
            driver.find_element_by_id("txtusername").send_keys("username")
            driver.find_element_by_id("txtpassword").clear()
            driver.find_element_by_id("txtpassword").send_keys("password")
            driver.find_element_by_id("btnsubmit").click()

            # Navigate to the correct area.
            # - code goes here -

            # Open the file and get started.
            with open(csvdoc, 'r') as csvfile:
                csvreader = csv.reader(csvfile, delimiter=',', quotechar='"')
                for row in csvreader:
                    elmid = row[0]      # id of the target text field
                    phrase = row[1]     # the French phrase to type
                    arealink = row[2]   # link text of the area to open
                    driver.find_element_by_xpath("//a[text()='%s']" % arealink).click()
                    time.sleep(1)
                    driver.find_element_by_id(elmid).clear()
                    driver.find_element_by_id(elmid).send_keys(phrase)
                    driver.find_element_by_id("btnsavephrase").click()

        def is_element_present(self, how, what):
            try:
                self.driver.find_element(by=how, value=what)
            except NoSuchElementException:
                return False
            return True

        def tearDown(self):
            self.driver.quit()
            self.assertEqual([], self.verificationErrors)

    if __name__ == "__main__":
        unittest.main()
After several hours of trying, I found a way to do this, but I had to move away from csv.reader. The problem you are facing is the classic Python byte-string versus unicode-string problem. I am not fluent in Python's unicode versus byte strings yet, and csv.reader seemed to use some kind of encoding in the background that I could not figure out. However:
    from selenium.webdriver.chrome.webdriver import WebDriver
    import io

    csvdoc = "your_path/file.csv"
    driver = WebDriver("your_path/chromedriver.exe")
    driver.get("http://google.com")
    element = driver.find_element_by_id("lst-ib")

    # io.open returns unicode text instead of byte strings.
    with io.open(csvdoc, 'r') as csvfile:
        csvcontent = csvfile.read()
        print(csvcontent)
        for l in csvcontent.splitlines():
            line = l.split(',')
            element.send_keys(line[0])
            element.send_keys(line[1])
            element.send_keys(line[2])
When I chose to fetch the contents of the file without csv.reader, I was able to get a predictable string to work with. Then it was just a matter of splitting it in the right loops. Finally, the strings were accepted by Selenium's send_keys() method.
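The read-then-split idea can be sketched without Selenium. This is only an illustration: the helper name read_rows and the "utf-8" default are assumptions that do not appear in the original code, and the encoding has to match however the CSV was actually saved.

```python
# -*- coding: utf-8 -*-
import io


def read_rows(path, encoding="utf-8"):
    """Read the whole file as text and split it into rows of cells.

    Hypothetical helper; the encoding default is an assumption.
    """
    with io.open(path, "r", encoding=encoding) as csvfile:
        content = csvfile.read()
    # Naive split on commas, as in the answer above. It does not
    # understand quoted cells that themselves contain a comma.
    return [line.split(",") for line in content.splitlines() if line]
```

Each cell comes back as a unicode string, so a row's second cell could be passed to send_keys() directly.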
I changed "with open(...) as" to "with io.open(...) as". This way I was able to pass an encoding value as a third argument (I am using Python 2.7). When I removed that third argument, the script still worked, but removing "io." did not work.
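The byte named in the error message hints at which encoding to try. In UTF-8, 0x82 can only be a continuation byte, never the start of a character, which is what "invalid start byte" means. In code pages 437 and 850, 0x82 is "é". So one possibility, which nothing in the question confirms, is that the CSV was saved in one of those code pages rather than in UTF-8. A small sketch under that assumption:

```python
# -*- coding: utf-8 -*-
# Bytes for "payé" as a cp850-encoded file would store them (assumed).
raw = b"pay\x82"

try:
    raw.decode("utf-8")
    failed_at = None
except UnicodeDecodeError as exc:
    # Index of the offending byte, matching "position 3" in the error.
    failed_at = exc.start

# Assumption: the file is in code page 850; cp437 maps 0x82 the same way.
text = raw.decode("cp850")
```

If that guess holds for the real file, passing encoding="cp850" to io.open() would decode it on read.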
I know this is a primitive way of solving the problem, but at least it works, and it is an answer.
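For comparison, here is a sketch that keeps csv.reader, assuming Python 3, where the csv module works on text. It does not carry over unchanged to Python 2.7, where csv works on byte strings. The helper name read_rows_csv and the "utf-8" default are hypothetical.

```python
import csv
import io


def read_rows_csv(path, encoding="utf-8"):
    """Parse the file with csv.reader on a decoded text stream (Python 3).

    Hypothetical helper; the encoding must match how the file was saved.
    """
    # newline="" lets the csv module handle line endings itself.
    with io.open(path, "r", encoding=encoding, newline="") as csvfile:
        reader = csv.reader(csvfile, delimiter=",", quotechar='"')
        return [row for row in reader]
```

Unlike a plain split(','), this keeps a quoted cell that contains a comma in one piece.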