-
pandas.merge_asof(left, right, on=None, left_on=None, right_on=None, left_index=False, right_index=False, by=None, left_by=None, right_by=None, suffixes=('_x', '_y'), tolerance=None, allow_exact_matches=True)
[source] -
Perform an asof merge. This is similar to a left-join except that we match on nearest key rather than equal keys.
For each row in the left DataFrame, we select the last row in the right DataFrame whose ?on? key is less than or equal to the left?s key. Both DataFrames must be sorted by the key.
Optionally match on equivalent keys with ?by? before searching for nearest match with ?on?.
New in version 0.19.0.
Parameters: left : DataFrame
right : DataFrame
on : label
Field name to join on. Must be found in both DataFrames. The data MUST be ordered. Furthermore this must be a numeric column, such as datetimelike, integer, or float. On or left_on/right_on must be given.
left_on : label
Field name to join on in left DataFrame.
right_on : label
Field name to join on in right DataFrame.
left_index : boolean
Use the index of the left DataFrame as the join key.
New in version 0.19.2.
right_index : boolean
Use the index of the right DataFrame as the join key.
New in version 0.19.2.
by : column name or list of column names
Match on these columns before performing merge operation.
left_by : column name
Field names to match on in the left DataFrame.
New in version 0.19.2.
right_by : column name
Field names to match on in the right DataFrame.
New in version 0.19.2.
suffixes : 2-length sequence (tuple, list, ...)
Suffix to apply to overlapping column names in the left and right side, respectively
tolerance : integer or Timedelta, optional, default None
select asof tolerance within this range; must be compatible to the merge index.
allow_exact_matches : boolean, default True
- If True, allow matching the same ?on? value (i.e. less-than-or-equal-to)
- If False, don?t match the same ?on? value (i.e., stricly less-than)
Returns: merged : DataFrame
See also
Examples
12345>>> left
a left_val
0
1
a
1
5
b
2
10
c
1234567>>> right
a right_val
0
1
1
1
2
2
2
3
3
3
6
6
4
7
7
12345>>> pd.merge_asof(left, right, on
=
'a'
)
a left_val right_val
0
1
a
1
1
5
b
3
2
10
c
7
12345>>> pd.merge_asof(left, right, on
=
'a'
, allow_exact_matches
=
False
)
a left_val right_val
0
1
a NaN
1
5
b
3.0
2
10
c
7.0
For this example, we can achieve a similar result thru
pd.merge_ordered()
, though its not nearly as performant.12345678>>> (pd.merge_ordered(left, right, on
=
'a'
)
... .ffill()
... .drop_duplicates([
'left_val'
])
... )
a left_val right_val
0
1
a
1.0
3
5
b
3.0
6
10
c
7.0
We can use indexed DataFrames as well.
12345>>> left
left_val
1
a
5
b
10
c
1234567>>> right
right_val
1
1
2
2
3
3
6
6
7
7
12345>>> pd.merge_asof(left, right, left_index
=
True
, right_index
=
True
)
left_val right_val
1
a
1
5
b
3
10
c
7
Here is a real-world times-series example
12345678910>>> quotes
time ticker bid ask
0
2016
-
05
-
25
13
:
30
:
00.023
GOOG
720.50
720.93
1
2016
-
05
-
25
13
:
30
:
00.023
MSFT
51.95
51.96
2
2016
-
05
-
25
13
:
30
:
00.030
MSFT
51.97
51.98
3
2016
-
05
-
25
13
:
30
:
00.041
MSFT
51.99
52.00
4
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.50
720.93
5
2016
-
05
-
25
13
:
30
:
00.049
AAPL
97.99
98.01
6
2016
-
05
-
25
13
:
30
:
00.072
GOOG
720.50
720.88
7
2016
-
05
-
25
13
:
30
:
00.075
MSFT
52.01
52.03
1234567>>> trades
time ticker price quantity
0
2016
-
05
-
25
13
:
30
:
00.023
MSFT
51.95
75
1
2016
-
05
-
25
13
:
30
:
00.038
MSFT
51.95
155
2
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.77
100
3
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.92
100
4
2016
-
05
-
25
13
:
30
:
00.048
AAPL
98.00
100
By default we are taking the asof of the quotes
123456789>>> pd.merge_asof(trades, quotes,
... on
=
'time'
,
... by
=
'ticker'
)
time ticker price quantity bid ask
0
2016
-
05
-
25
13
:
30
:
00.023
MSFT
51.95
75
51.95
51.96
1
2016
-
05
-
25
13
:
30
:
00.038
MSFT
51.95
155
51.97
51.98
2
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.77
100
720.50
720.93
3
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.92
100
720.50
720.93
4
2016
-
05
-
25
13
:
30
:
00.048
AAPL
98.00
100
NaN NaN
We only asof within 2ms betwen the quote time and the trade time
12345678910>>> pd.merge_asof(trades, quotes,
... on
=
'time'
,
... by
=
'ticker'
,
... tolerance
=
pd.Timedelta(
'2ms'
))
time ticker price quantity bid ask
0
2016
-
05
-
25
13
:
30
:
00.023
MSFT
51.95
75
51.95
51.96
1
2016
-
05
-
25
13
:
30
:
00.038
MSFT
51.95
155
NaN NaN
2
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.77
100
720.50
720.93
3
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.92
100
720.50
720.93
4
2016
-
05
-
25
13
:
30
:
00.048
AAPL
98.00
100
NaN NaN
We only asof within 10ms betwen the quote time and the trade time and we exclude exact matches on time. However prior data will propogate forward
1234567891011>>> pd.merge_asof(trades, quotes,
... on
=
'time'
,
... by
=
'ticker'
,
... tolerance
=
pd.Timedelta(
'10ms'
),
... allow_exact_matches
=
False
)
time ticker price quantity bid ask
0
2016
-
05
-
25
13
:
30
:
00.023
MSFT
51.95
75
NaN NaN
1
2016
-
05
-
25
13
:
30
:
00.038
MSFT
51.95
155
51.97
51.98
2
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.77
100
720.50
720.93
3
2016
-
05
-
25
13
:
30
:
00.048
GOOG
720.92
100
720.50
720.93
4
2016
-
05
-
25
13
:
30
:
00.048
AAPL
98.00
100
NaN NaN
pandas.merge_asof()

2025-01-10 15:47:30
Please login to continue.