I have a CSV file with some time series data. I create a DataFrame like this:
df = pd.read_csv('C:\\Desktop\\Scripts\\TimeSeries.log')
When I call df.head(6), the data appears as follows:
Company  Date               Value
ABC      08/21/16 00:00:00  500
ABC      08/22/16 00:00:00  600
ABC      08/23/16 00:00:00  650
ABC      08/24/16 00:00:00  625
ABC      08/25/16 00:00:00  675
ABC      08/26/16 00:00:00  680
Then I use the following to convert the 'Date' column to datetime format:
df['Date'] = pd.to_datetime(df['Date'], errors = 'coerce')
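Since every sample row follows the same pattern, the conversion can also be given an explicit format. This is only a minimal sketch: it assumes the whole file uses month/day/two-digit-year as in the df.head(6) output, and the inline rows stand in for the real file.

```python
import pandas as pd

# Sample rows copied from the df.head(6) output (a stand-in for the real file).
df = pd.DataFrame({
    "Company": ["ABC"] * 3,
    "Date": ["08/21/16 00:00:00", "08/22/16 00:00:00", "08/23/16 00:00:00"],
    "Value": [500, 600, 650],
})

# An explicit format avoids day/month guessing; errors="coerce" turns
# unparseable entries into NaT instead of raising.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y %H:%M:%S", errors="coerce")

# Count the rows that failed to parse (they would be silently NaT otherwise).
n_failed = int(df["Date"].isna().sum())
print(df["Date"].dtype, n_failed)
```

Checking the NaT count after coercing is worthwhile, because errors='coerce' hides bad rows rather than reporting them.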
Interestingly, I see "pandas.core.series.Series" when I call the following:
type(df['Date'])
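That result is expected: type() reports the container class, which is a Series for any DataFrame column, while the element type lives in the dtype attribute. A small sketch with made-up values (the second entry is deliberately invalid) shows the difference:

```python
import pandas as pd

# Hypothetical two-row column; the second entry cannot be parsed.
raw = pd.Series(["08/21/16 00:00:00", "not a date"])
dates = pd.to_datetime(raw, format="%m/%d/%y %H:%M:%S", errors="coerce")

print(type(dates).__name__)   # the container class, always Series for a column
print(dates.dtype)            # the element type, a datetime64 dtype after conversion
print(dates.isna().tolist())  # which entries were coerced to NaT
```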
Finally, I call the following to create a plot:
%matplotlib qt
sns.tsplot(df['Value'])
On the x-axis, from left to right, I see integers ranging from 0 to the number of rows in the DataFrame. How can I use the 'Date' column as the x-axis values in this plot?
Thanks!
3 Answers
#1
5
I'm not sure tsplot is the best tool for this. You can simply use:
df[['Date','Value']].set_index('Date').plot()
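For reference, here is a self-contained version of that one-liner. It is a sketch only: the inline rows are copied from the question, and the Agg backend is an assumption made so the snippet runs without a display (drop that line in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove in an interactive session
import pandas as pd

# Rows copied from the question's df.head(6) output.
df = pd.DataFrame({
    "Company": ["ABC"] * 6,
    "Date": ["08/21/16", "08/22/16", "08/23/16", "08/24/16", "08/25/16", "08/26/16"],
    "Value": [500, 600, 650, 625, 675, 680],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y")

# With 'Date' as the index, pandas uses it for the x axis automatically.
ax = df[["Date", "Value"]].set_index("Date").plot()
ax.set_ylabel("Value")
```

The call returns a matplotlib Axes, so labels and tick formatting can be adjusted afterwards in the usual way.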
#2
2
Use the time parameter of tsplot.
From the docs:
time : string or series-like
Either the name of the field corresponding to time in the data DataFrame or x values for a plot when data is an array. If a Series, the name will be used to label the x axis.
#Plot the Value column against Date column
sns.tsplot(data = df['Value'], time = df['Date'])
However, tsplot is meant for plotting several time series that share the same time window under different conditions. To plot a single time series you could also use plt.plot(df['Date'], df['Value']).
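Note that matplotlib's plot takes the x values and then the y values as positional arguments; it has no time keyword. Also, tsplot was deprecated and later removed from newer seaborn releases in favor of lineplot, so a plain matplotlib version may be the more portable choice. A minimal sketch, assuming the question's sample rows and a headless backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove in an interactive session
import matplotlib.pyplot as plt
import pandas as pd

# Rows copied from the question's df.head(6) output.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["08/21/16", "08/22/16", "08/23/16", "08/24/16", "08/25/16", "08/26/16"],
        format="%m/%d/%y",
    ),
    "Value": [500, 600, 650, 625, 675, 680],
})

fig, ax = plt.subplots()
ax.plot(df["Date"], df["Value"])  # x first, then y
ax.set_xlabel("Date")
ax.set_ylabel("Value")
fig.autofmt_xdate()               # rotate the date labels so they do not overlap
```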
#3
0
This is probably too late to help, but here is another approach.
First, note that the 'Date' column holds datetime values, so extract just the date part like this:
df['Date'] = df['Date'].map(lambda x:x.date())
Now group the DataFrame by 'Date', then reset the index so that 'Date' becomes a column again (not an index). Note that count(), as used below, yields the number of rows per date rather than the values themselves.
Then you can use plt.plot_date:
df_groupedby_date = df.groupby('Date').count()
df_groupedby_date.reset_index(inplace=True)
plt.plot_date(x=df_groupedby_date['Date'], y=df_groupedby_date['Value'])
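To see what the grouping step produces, compare count() with sum() on a small example. The data here is hypothetical (two rows deliberately share a date so the difference is visible), and using sum() is only one possible choice if the goal is to plot values instead of row counts.

```python
import pandas as pd

# Hypothetical rows: two entries fall on the same date.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2016-08-21", "2016-08-21", "2016-08-22"]),
    "Value": [500, 600, 650],
})

# count() gives how many rows each date has; sum() adds up the values.
counts = df.groupby("Date")["Value"].count().reset_index()
totals = df.groupby("Date")["Value"].sum().reset_index()

print(counts)
print(totals)
```

With one row per date, as in the question's data, count() would plot a flat line at 1, so an aggregate such as sum() or mean() is probably closer to what was intended.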