How to plot dates on the x-axis using Seaborn (or matplotlib)

Time: 2021-12-25 19:35:29

I have a CSV file with some time series data. I create a DataFrame like this:

df = pd.read_csv('C:\\Desktop\\Scripts\\TimeSeries.log')

When I call df.head(6), the data appears as follows:

Company     Date                 Value
ABC         08/21/16 00:00:00    500
ABC         08/22/16 00:00:00    600
ABC         08/23/16 00:00:00    650
ABC         08/24/16 00:00:00    625
ABC         08/25/16 00:00:00    675
ABC         08/26/16 00:00:00    680

Then I use the following to convert the 'Date' column to datetime:

df['Date'] = pd.to_datetime(df['Date'], errors = 'coerce')

Interestingly, I see "pandas.core.series.Series" when I call the following:

type(df['Date'])
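
For what it's worth, type() on a column always reports Series, because that is the container. The dtype attribute is what shows whether the conversion worked. A minimal sketch, assuming an inline stand-in for the CSV (csv_text is made up for illustration) and assuming the dates are month-first, as the sample rows suggest:

```python
import io

import pandas as pd

# Hypothetical stand-in for the file in the question (same columns, same date style).
csv_text = """Company,Date,Value
ABC,08/21/16 00:00:00,500
ABC,08/22/16 00:00:00,600
ABC,08/23/16 00:00:00,650
"""
df = pd.read_csv(io.StringIO(csv_text))

# An explicit format removes any month-first vs. day-first guessing;
# errors='coerce' turns unparseable strings into NaT instead of raising.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y %H:%M:%S', errors='coerce')

print(type(df['Date']))          # the container class, a pandas Series either way
print(df['Date'].dtype)          # a datetime64 dtype once the conversion has worked
print(df['Date'].isna().sum())   # number of rows that failed to parse
```

If the last number is not zero, some rows did not match the assumed format and were silently turned into NaT.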

Finally, I call the following to create a plot:

%matplotlib qt
sns.tsplot(df['Value'])

On the x-axis from left to right, I see integers ranging from 0 to the number of rows in the data frame. How does one add the 'Date' column as the x-axis values to this plot?

Thanks!

3 answers

#1


5  

I'm not sure tsplot is the best tool for this. You can just use:

df[['Date','Value']].set_index('Date').plot()
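
This works because pandas uses a DatetimeIndex for the x-axis automatically. A self-contained sketch of the same idea, assuming a small inline frame in place of the CSV and the non-interactive Agg backend so it runs without a display (both are assumptions, not part of the question):

```python
import matplotlib
matplotlib.use('Agg')  # assumed here; drop this line to use an interactive backend
import pandas as pd

# Inline stand-in for the data in the question.
df = pd.DataFrame({
    'Company': ['ABC'] * 3,
    'Date': pd.to_datetime(['2016-08-21', '2016-08-22', '2016-08-23']),
    'Value': [500, 600, 650],
})

# Moving 'Date' into the index makes pandas draw the dates on the x-axis.
ax = df.set_index('Date')['Value'].plot()
ax.set_ylabel('Value')
ax.figure.autofmt_xdate()  # tilt the tick labels so they do not overlap
```

Selecting the 'Value' column after set_index gives a Series, so the plot has a single line and no legend entry to clean up.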

#2


2  

Use the time parameter of tsplot. From the docs:

time : string or series-like
    Either the name of the field corresponding to time in the data DataFrame or x values for a plot when data is an array. If a Series, the name will be used to label the x axis.

# Plot the Value column against the Date column
sns.tsplot(data=df['Value'], time=df['Date'])

However, tsplot is meant for plotting several time series over the same time window under different conditions. To plot a single time series you could also use plt.plot(df['Date'], df['Value']).
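
Note that plt.plot takes x and y positionally and has no time keyword. A minimal matplotlib-only sketch, assuming an inline stand-in for the data, the Agg backend, and a month/day/year tick format chosen only to mirror the CSV:

```python
import matplotlib
matplotlib.use('Agg')  # assumed here; drop this line to use an interactive backend
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Inline stand-in for the data in the question.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2016-08-21', '2016-08-22', '2016-08-23']),
    'Value': [500, 600, 650],
})

fig, ax = plt.subplots()
ax.plot(df['Date'], df['Value'], marker='o')  # x first, then y

# One tick per day, labelled the same way the dates appear in the file.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%y'))
fig.autofmt_xdate()
```

For a long series, DayLocator produces too many ticks; mdates.AutoDateLocator is the usual choice there.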

#3


0  

I think it is too late.

First, note that the 'Date' column is a Series of datetime values, so to keep only the date part you can do this:

df['Date'] = df['Date'].dt.date  # unlike x.date() in a lambda, this tolerates NaT from errors='coerce'

Now group your data frame by 'Date' and reset the index so that 'Date' becomes a column again rather than the index. Then you can use plt.plot_date. Note that count() gives the number of rows per date; use an aggregation such as sum() or mean() if you want to plot the values themselves.

df_groupedby_date = df.groupby('Date').count()
df_groupedby_date.reset_index(inplace=True)
plt.plot_date(x=df_groupedby_date['Date'], y=df_groupedby_date['Value'])
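
To make the difference between counting and aggregating concrete, here is a small sketch. It assumes made-up data with two readings on the first day, and a helper column named 'Day' that is not in the original frame:

```python
import pandas as pd

# Made-up data: two readings on Aug 21, one on Aug 22.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2016-08-21 00:00', '2016-08-21 12:00', '2016-08-22 00:00']),
    'Value': [500, 100, 600],
})

# normalize() resets the time to midnight but keeps the datetime64 dtype,
# so the result still plots on a proper date axis.
df['Day'] = df['Date'].dt.normalize()

# One row per day: how many readings there were, and their total.
per_day = df.groupby('Day')['Value'].agg(['count', 'sum']).reset_index()
print(per_day)
```

Plotting per_day['count'] shows how many rows fall on each day, while per_day['sum'] (or a mean) shows the values, which is what the question asks for.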
