How do I add a new column to a CSV file?
- 2025-01-22 08:45:00
- admin (original post)
Problem description:
I have several CSV files that look like this:
Input
Name Code
blackberry 1
wineberry 2
rasberry 1
blueberry 1
mulberry 2
I want to add a new column to all of the CSV files so that they look like this:
Output
Name Code Berry
blackberry 1 blackberry
wineberry 2 wineberry
rasberry 1 rasberry
blueberry 1 blueberry
mulberry 2 mulberry
The script I have so far is this:
import csv
with open('input.csv', 'r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            writer.writerow(row+['Berry'])
(Python 3.2)
But in the output, the script skips every other line, and the new column contains only Berry:
Output
Name Code Berry
blackberry 1 Berry
wineberry 2 Berry
rasberry 1 Berry
blueberry 1 Berry
mulberry 2 Berry
Solution 1:
This should give you an idea of what to do:
>>> import csv
>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = r.next()
>>> row0.append('berry')
>>> print row0
['Name', 'Code', 'berry']
>>> for item in r:
...     item.append(item[0])
...     print item
...
['blackberry', '1', 'blackberry']
['wineberry', '2', 'wineberry']
['rasberry', '1', 'rasberry']
['blueberry', '1', 'blueberry']
['mulberry', '2', 'mulberry']
>>>
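Edit: note that the session above is Python 2. In Python 3 you must use next(r) instead of r.next(), and print is a function.
Thanks for accepting the answer. Here is a bonus (your working script):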
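import csv

with open('C:/test/test.csv', 'r') as csvinput:
    with open('C:/test/output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        reader = csv.reader(csvinput)

        all_rows = []  # collect every row, then write once at the end

        # Header row: add the name of the new column
        row = next(reader)
        row.append('Berry')
        all_rows.append(row)

        # Data rows: copy the first column into the new column
        for row in reader:
            row.append(row[0])
            all_rows.append(row)

        writer.writerows(all_rows)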
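Note the lineterminator parameter passed to csv.writer. By default it is set to '\r\n', which is why you were getting double spacing. The script appends all rows to a list and writes them in one go with writerows. If your file is very large this is probably not a good idea (RAM), but for normal files I think it is faster because there is less I/O. As pointed out in the comments on this post, instead of nesting the two with statements you can put both on the same line:
with open('C:/test/test.csv','r') as csvinput, open('C:/test/output.csv', 'w') as csvoutput: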
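For larger files, the same idea can be sketched without holding every row in memory: write each row as soon as it is read. The version below also opens both files with newline='', which is what the csv module documentation recommends on Python 3, so the lineterminator workaround should not be needed. The function name add_name_copy_column and its parameters are made up for this illustration and are not part of the original answer.

```python
import csv

def add_name_copy_column(in_path, out_path, header="Berry"):
    """Copy in_path to out_path, appending a column that repeats column 0.

    Hypothetical helper for illustration; rows are streamed one at a time.
    """
    with open(in_path, "r", newline="") as csvinput, \
         open(out_path, "w", newline="") as csvoutput:
        reader = csv.reader(csvinput)
        writer = csv.writer(csvoutput)

        # Header row gets the new column name
        first = next(reader, None)
        if first is None:
            return  # empty input: nothing to write
        writer.writerow(first + [header])

        # Each data row gets a copy of its first field
        for row in reader:
            if not row:
                continue  # skip blank lines
            writer.writerow(row + [row[0]])
```

Calling add_name_copy_column('input.csv', 'output.csv') would then produce the Name, Code, Berry layout asked for in the question, assuming the input is comma-separated.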
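Solution 2:
I'm surprised no one has suggested Pandas. Pulling in a set of dependencies like Pandas may seem more heavy-handed than such a simple task requires, but it produces a very short script, and Pandas is a great library for all sorts of CSV (and really all data types) manipulation. It's hard to argue with 4 lines of code: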
import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)
Check out the pandas website for more information!
Contents of output.csv:
Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry
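Solution 3:
import csv
with open('input.csv','r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            if row[0] == "Name":
                # Header row: add the new column name
                writer.writerow(row+["Berry"])
            else:
                # Data row: repeat the first column
                writer.writerow(row+[row[0]])
Maybe something like this is what you intended?
Also, csv stands for comma-separated values, so I think you need commas to separate your values, like this: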
Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2
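If the input files really are separated by whitespace, as they are displayed in the question, one possible approach is to split each line on whitespace and write proper comma-separated output. This is only a sketch under that assumption: the function name and paths are hypothetical, and it would misbehave for names that themselves contain spaces.

```python
import csv

def whitespace_table_to_csv(in_path, out_path, header="Berry"):
    """Read a whitespace-separated table and write CSV with one extra column.

    Hypothetical helper: assumes no field contains a space.
    """
    with open(in_path, "r") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        is_header = True
        for line in src:
            fields = line.split()  # splits on any run of whitespace
            if not fields:
                continue  # skip blank lines
            # Header row gets the column name, data rows repeat field 0
            writer.writerow(fields + [header if is_header else fields[0]])
            is_header = False
```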
Solution 4:
Yes, this is an old question, but it might help someone.
import csv
import uuid

# read and write csv files
with open('in_file', 'r', newline='') as r_csvfile:
    with open('out_file', 'w', newline='') as w_csvfile:
        dict_reader = csv.DictReader(r_csvfile, delimiter='|')
        # add new column to the existing ones
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']
        writer_csv = csv.DictWriter(w_csvfile, fieldnames, delimiter='|')
        writer_csv.writeheader()

        for row in dict_reader:
            # fill the new column with the first 6 digits of a random number
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64)[0:6]
            writer_csv.writerow(row)
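Solution 5:
I used pandas and it worked well. In my case I had to open a file, add some random columns to it, and then save it back to the same file.
This code adds multiple columns; you can edit it as much as you need.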
import pandas as pd
csv_input = pd.read_csv('testcase.csv') #reading my csv file
csv_input['Phone1'] = csv_input['Name'] #this would also copy the cell value
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False) #this writes back to your file
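If you don't want the cell values to be copied, first create an empty column manually in your CSV file, for example one named Hours. You can then add this line to the code above:
csv_input['New Value'] = csv_input['Hours']
Or, more simply, skip the manual column and write:
csv_input['New Value'] = ''  # simple and easy
I hope it helps.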
Solution 6:
You can just write:
import pandas as pd
import csv
df = pd.read_csv('csv_name.csv')
df['Berry'] = df['Name']
df.to_csv("csv_name.csv",index=False)
Then you're done. To check it, you can run:
h = pd.read_csv('csv_name.csv')
print(h)
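If you want to add a column with arbitrary new elements (a, b, c), you can replace line 4 of the code (the df['Berry'] = df['Name'] line) with the following. The list must contain exactly one value per data row, so this particular line only works for a file with three rows: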
df['Berry'] = ['a','b','c']
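Solution 7:
This code does what you asked for, and I have tested it on the sample data.
import csv

# Placeholder paths; replace them with your own files
in_path = 'input.csv'
out_path = 'output.csv'

with open(in_path, 'r', newline='') as f_in, open(out_path, 'w', newline='') as f_out:
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
        writer.writerow(row + [row[0]])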
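Solution 8:
To add a new column to an existing CSV file (with headers), when the column to be added has a small enough number of values, here is a convenient function (somewhat similar to @joaquin's solution). The function takes:
- the existing CSV filename
- the output CSV filename (which will contain the updated content)
- a list containing the header name followed by the column values
import csv

def add_col_to_csv(csvfile, fileout, new_list):
    with open(csvfile, 'r', newline='') as read_f, \
         open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        i = 0
        for row in csv_reader:
            # new_list[0] is the header, new_list[1:] are the column values
            row.append(new_list[i])
            csv_writer.writerow(row)
            i += 1
Example: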
new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)
Existing CSV file: [image not preserved]
Output (updated) CSV file: [image not preserved]
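The function above assumes that the list has exactly one entry per row of the file, header included; a shorter list would raise an IndexError partway through writing. A possible variant that checks this up front is sketched below. The name add_col_checked is made up for this illustration, and, like the original, it is meant for files small enough to read into memory.

```python
import csv

def add_col_checked(csvfile, fileout, new_list):
    """Append new_list as a column; new_list[0] is the header.

    Hypothetical variant that validates the length before writing anything.
    """
    with open(csvfile, "r", newline="") as read_f:
        rows = list(csv.reader(read_f))

    # One value is needed for every row, including the header row
    if len(rows) != len(new_list):
        raise ValueError(
            f"expected {len(rows)} values (header included), got {len(new_list)}"
        )

    with open(fileout, "w", newline="") as write_f:
        csv.writer(write_f).writerows(
            row + [value] for row, value in zip(rows, new_list)
        )
```

Because the check happens before the output file is opened, a mismatched list leaves no partially written file behind.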
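Solution 9:
I can't see where you are adding the new column, but try this:
import csv

i = 0
# Each line of newcolumn.csv holds one value for the new column
with open("newcolumn.csv", "r") as f:
    berry = f.read().splitlines()
with open("input.csv", "r", newline="") as csvinput:
    with open("output.csv", "w", newline="") as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            # row is a list, so append the new value as a one-element list
            writer.writerow(row + [berry[i]])
            i += 1  # Python has no ++ operator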
Solution 10:
If the file is large, you can use the chunksize parameter of pandas.read_csv, which lets you read the dataset one chunk at a time:
import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000  # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    # index=False keeps the DataFrame row index out of the file
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode, index=False)
    header = False  # Do not save the header for the other chunks
    mode = "a"  # 'a' stands for append mode, all the other chunks will be appended
If you want to update the file in place, you can write to a temporary file and replace the original with it at the end:
import os

import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000  # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    # index=False keeps the DataFrame row index out of the file
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode, index=False)
    header = False  # Do not save the header for the other chunks
    mode = "a"  # 'a' stands for append mode, all the other chunks will be appended

# Move the temporary file over the original
os.replace(TMP_CSV, INPUT_CSV)
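The same write-to-a-temporary-file-then-replace pattern also works with only the standard library, which may be useful where pandas is not available. The sketch below streams rows with the csv module and cleans up the temporary file if anything fails. The function name add_column_in_place is an assumption for illustration, and the temporary file is created in the same directory as the input so that os.replace stays on one filesystem.

```python
import csv
import os
import tempfile

def add_column_in_place(path, header="Berry"):
    """Rewrite path with an extra column that repeats the first field.

    Hypothetical helper: writes to a temp file, then replaces the original.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=dir_name)
    try:
        with os.fdopen(fd, "w", newline="") as tmp, \
             open(path, "r", newline="") as src:
            writer = csv.writer(tmp)
            is_header = True
            for row in csv.reader(src):
                if not row:
                    continue  # skip blank lines
                writer.writerow(row + [header if is_header else row[0]])
                is_header = False
        # Both files are closed here, so the replace is safe on Windows too
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and remove the partial temp file
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```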
Solution 11:
Append a new column to an existing CSV file using Python, without a header name:
import csv

default_text = 'Some Text'
# Open the input file in read mode and the output file in write mode
with open('problem-one-answer.csv', 'r', newline='') as read_obj, \
     open('output_1.csv', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = csv.reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
    # Read each row of the input csv file as list
    for row in csv_reader:
        # Append the default text in the row / list
        row.append(default_text)
        # Add the updated row / list to the output file
        csv_writer.writerow(row)
Thanks