UPDATE Multiple PostgreSQL Table Records in Parallel

Unfortunately, the RPostgreSQL package (and, I'm pretty sure, the packages for other SQL DBs as well) has no provision for UPDATEing multiple records (say, a whole data.frame) at once, nor does it allow placeholders, which makes an UPDATE a one-row-at-a-time ordeal. So I built a workaround hack to do the job in parallel. The big problem was that you have to open and close the connection on every iteration, or you will exceed the maximum number of connections, since the loop goes through every row.

First the function for connecting, updating, and closing the DB:

update <- function(i) {
    # Open a fresh connection for this row
    drv <- dbDriver("PostgreSQL")
    con <- dbConnect(drv, dbname="db_name", host="localhost", port="5432", user="chris", password="password")
    txt <- paste("UPDATE data SET column_one=",data$column_one[i],",column_two=",data$column_two[i]," where id=",data$id[i])
    dbGetQuery(con, txt)
    # Close the connection so the max connection limit is never exceeded
    dbDisconnect(con)
}
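
Before pointing a function like this at a real table, it can help to check the statement text on its own. Here is a minimal sketch that only builds the strings and never touches the database. The sample data frame and the helper name build_update are hypothetical, and the sketch assumes all three columns are numeric, so no quoting is needed.

```r
# Hypothetical sample data with the same columns the post uses.
data <- data.frame(id = c(1, 2), column_one = c(1.5, 2.5), column_two = c(10, 20))

# Build the UPDATE statement for row i without opening a connection.
# Assumes numeric columns only; character values would need proper quoting.
build_update <- function(i) {
    sprintf("UPDATE data SET column_one=%s, column_two=%s WHERE id=%s",
            data$column_one[i], data$column_two[i], data$id[i])
}

# One statement per row of the data frame.
statements <- vapply(seq_len(nrow(data)), build_update, character(1))
print(statements)
```

Printing the statements first makes it easy to spot a misplaced comma or a wrong column before any row is changed.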

Then run the query:



foreach(i = 1:length(data$column_one), .inorder=FALSE,.packages="RPostgreSQL")%dopar%{
    update(i)
}

3 thoughts on “UPDATE Multiple PostgreSQL Table Records in Parallel”

  1. I was able to do something similar with one line of apply. In your case it would look like this,
    assuming your data frame has the following columns:
    column_one, column_two, id

    apply(data, 1, function(x) { dbGetQuery(con, paste("UPDATE data SET column_one =", x[1], ", column_two = ", x[2], " where id = ", x[3]) ) } )

    apply iterates over every row in the data frame and supplies a vector of that row to the function as x. The vector is then accessed as x[1], x[2], x[3] in the paste function to create the SQL UPDATE statement.
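
To see what apply hands to the function, here is a minimal sketch that builds the statements without a database connection. The data frame df is hypothetical and all-numeric, with columns in the order the comment assumes. One caveat worth knowing: apply first converts the data frame to a matrix, so if any column is non-numeric, every value in x arrives as a character string.

```r
# Hypothetical all-numeric data frame, columns ordered as in the comment above.
df <- data.frame(column_one = c(1.5, 2.5), column_two = c(10, 20), id = c(1, 2))

# apply() coerces df to a matrix and passes each row to the function as vector x,
# so x[1], x[2], x[3] are column_one, column_two and id for that row.
statements <- apply(df, 1, function(x) {
    paste0("UPDATE data SET column_one = ", x[1],
           ", column_two = ", x[2],
           " WHERE id = ", x[3])
})
print(statements)
```

Because the rows are addressed by position, reordering the columns of the data frame silently changes which value lands in which part of the statement.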

