Previously, the view contents could become inconsistent with the base tables
in the following scenarios:
1) A concurrent transaction modifies a base table and commits before the
incremental view maintenance starts in the current transaction.
2) A concurrent transaction modifies a base table and commits before the
create_immv or refresh_immv command generates data.
3) Concurrent transactions incrementally update a view with a self-join
or modify multiple base tables simultaneously.
Incremental updates of a view are generally performed sequentially using an
exclusive lock. However, even if we are able to acquire the lock, a concurrent
transaction may have already incrementally updated the view and been committed
before we can acquire it. In REPEATABLE READ or SERIALIZABLE isolation levels,
this could lead to an inconsistent view state, which is the cause of the first
issue.
To fix this, a new field, lastivmupdate, has been added to the pg_ivm_immv
catalog to record the transaction ID of the most recent update to the view.
Before performing view maintenance, the transaction ID is checked. If the
transaction was still in progress at the start of the current transaction,
an error is raised to prevent anomalies.
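The check described above can be illustrated with a toy model of snapshots. This is only a sketch under stated assumptions, not pg_ivm's actual code: the `Snapshot` class, `check_last_update`, `ConcurrentUpdateError`, and the representation of running transactions as a set of IDs are all hypothetical, and XID wraparound is ignored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for a transaction snapshot (hypothetical, not a pg_ivm type)."""
    xmax: int  # first transaction ID not yet assigned when the snapshot was taken
    in_progress: frozenset = field(default_factory=frozenset)  # IDs running at that time

    def was_in_progress(self, xid: int) -> bool:
        # Treat a transaction as "in progress" for this snapshot if it was
        # running when the snapshot was taken or if it started afterwards.
        return xid >= self.xmax or xid in self.in_progress


class ConcurrentUpdateError(Exception):
    """Raised when the view was last updated by a transaction we cannot see."""


def check_last_update(snapshot: Snapshot, lastivmupdate: int) -> None:
    """Refuse maintenance if the last updater had not finished at snapshot time."""
    if snapshot.was_in_progress(lastivmupdate):
        raise ConcurrentUpdateError(
            f"view was updated by transaction {lastivmupdate}, "
            "which was still in progress when this transaction started"
        )
```

In this model, a view whose recorded last update came from a transaction that finished before the snapshot passes the check, while one recorded by a transaction that was running or not yet started is rejected.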
To fix the second issue, the call to CreateTrigger() has been moved to
before data generation. This ensures that locks conflicting with table
modifications have been acquired on all base tables. In addition, the latest
snapshot is used during data generation at the READ COMMITTED level to reflect
committed changes from concurrent transactions. Inconsistencies that cannot
be avoided through locking are prevented by checking the transaction
ID of the last view update, as done for the first issue.
However, concurrent table modifications and create_immv execution still cannot
be detected at the time of view creation. Therefore, create_immv raises a warning
in REPEATABLE READ or SERIALIZABLE isolation levels, suggesting that the command
be used in READ COMMITTED mode or that refresh_immv be executed afterward to
ensure the view remains consistent.
The third issue was caused by the snapshot used for checking tuple visibility in
the table's pre-update state not being the latest one. To fix this, the latest
snapshot is now used in READ COMMITTED mode.
Isolation tests are also added.
Issue #104
Previously, pg_upgrade failed with a permission-denied error
because the pg_ivm_immv catalog was in the pg_catalog schema
(Issue #79). To fix this, all objects created by pg_ivm are
moved to the schema pgivm, which is also created by pg_ivm.
pg_ivm is still not relocatable and must be installed
in the pgivm schema because the pg_ivm module refers to the
catalog and some internal functions without schema
qualification. In the future, it might become
relocatable during installation, though.
This commit affects compatibility with previous releases.
To access objects such as the create_immv function as
before, you need to qualify them with the schema name
or set search_path appropriately.
Compilation errors and warnings are fixed.
The design of create_immv is also changed to be similar to PG17: that is,
a relation is first created without data and then populated
using the refresh logic.
This commit contains the following changes:
- Change functions to use a safe search_path during maintenance operations
when used with PostgreSQL 17
This prevents maintenance operations (automatic maintenance of IMMVs and
refresh_immv) from performing unsafe access. Functions used by IMMVs that
need to reference non-default schemas must specify a search path during
function creation.
- refresh_immv can be executed by users with the MAINTAIN privilege
when used with PostgreSQL 17
Issue #90
Build errors and warnings against PostgreSQL 16 are fixed.
The code is also adapted to upstream changes, including:
- Get rid of the "new" and "old" entries in a view's rangetable.
(The removed code was dead anyway because pg_ivm doesn't
have any rules in pg_rewrite.)
- Rework query relation permission checking
- Require empty Bitmapsets to be represented as NULL
- Fix some compiler warnings
Add EXISTS clause support in IVM
Correlated subqueries using EXISTS in the WHERE clause are supported.
An EXISTS subquery in the WHERE clause is rewritten to a LATERAL subquery
in the FROM clause, so the IVM process can handle it like a normal join.
Also, hidden columns "ivm_exists_count_X__" are added to check whether the
EXISTS condition is satisfied. Each such column stores the count of how many
rows in the subquery are correlated to (joined to) each row of the main
query. When a base table contained in an EXISTS clause is modified, this
count value in the IMMV is updated, and a row whose count becomes zero
is deleted.
Restrictions:
- EXISTS subqueries are allowed only in the WHERE clause.
- Aggregate functions are not supported together with EXISTS.
- EXISTS subqueries in a subquery are not supported.
- An EXISTS condition can be combined with other conditions only by AND.
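The count-based maintenance for EXISTS can be sketched as follows. This is a toy model, not pg_ivm's implementation: the function names, the representation of main-query rows as `(id, key)` tuples, and the assumption that main rows are unique are all made up for illustration. The dictionary value plays the role of the hidden count column.

```python
from collections import Counter


def create_view(main_rows, sub_keys):
    """Build {main_row: exists_count}, keeping only rows with a match."""
    counts = Counter(sub_keys)
    # row[1] is the correlation key shared with the subquery's table.
    return {row: counts[row[1]] for row in main_rows if counts[row[1]] > 0}


def apply_sub_delta(view, main_rows, inserted_keys=(), deleted_keys=()):
    """Update the counts after the subquery's base table is modified."""
    delta = Counter(inserted_keys)
    delta.subtract(Counter(deleted_keys))  # keeps negative counts
    for row in main_rows:
        change = delta.get(row[1], 0)
        if change == 0:
            continue
        new_count = view.get(row, 0) + change
        if new_count > 0:
            view[row] = new_count
        else:
            # The EXISTS condition no longer holds, so the row leaves the view.
            view.pop(row, None)
    return view
```

A row stays in the view as long as at least one correlated subquery row remains, which is why a counter is kept rather than a boolean flag.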
Simple CTEs which do not contain aggregates or DISTINCT are
now supported similarly to simple sub-queries.
Before a view is maintained, all CTEs are converted to corresponding
subqueries so that CTEs can be treated the same as subqueries. To this
end, the code of the static function inline_cte in the core
(optimizer/plan/subselect.c) was imported.
Unreferenced CTEs are prohibited.
When a table in an unreferenced CTE is TRUNCATEd, the contents
of the IMMV are not affected, so the IMMV must not be truncated. To
confirm this at maintenance time, we would have to check whether the
modified table used in a CTE is actually referenced. Although
this would be possible, we simply disallow creating such IMMVs for now,
since an unreferenced CTE is useless unless it contains
data-modifying commands, which are already prohibited.
Previously, an IMMV whose definition includes another IMMV could be created by
create_immv(), but this should not be supported because we
cannot maintain it recursively for now. This patch prevents it by raising
an error for such a view definition in create_immv().
When multiple tables are updated or the view contains a self-join,
we need to calculate the state of each table before it was modified
during incremental view maintenance. To get the pre-update state,
tuples inserted in a command must be removed when the table is scanned.
Previously, we used the xmin and cmin system columns for this purpose,
but this is problematic because after a tuple is frozen, its xmin
no longer has any meaning. In fact, we get inconsistent view
results after XID wraparound.
A similar inconsistency can also occur when using sub-transactions
because xmin values do not always increase monotonically with command
execution.
To fix this, we use a snapshot taken just before the table is
modified to check tuple visibility in the pre-state table instead of
using the xmin and cmin system columns. A new function returning boolean,
ivm_visible_in_prestate, is added, and it is called in the WHERE clause
of the sub-queries that calculate the pre-state table. This function checks
whether a specified tuple in the post-update table is visible using the
snapshot and returns true if the tuple is visible.
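The idea of filtering the post-update table through a pre-modification snapshot can be sketched with a toy model. Everything here is an assumption made for illustration: a table is a dict from tuple identifier to row, a snapshot is simply the set of identifiers visible when it was taken, and `visible_in_prestate` is a stand-in named after ivm_visible_in_prestate rather than its real implementation. Only the removal of newly inserted tuples is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreStateSnapshot:
    """Toy snapshot: the tuple identifiers visible just before modification."""
    visible_tids: frozenset


def take_snapshot(table):
    # Called just before the table is modified.
    return PreStateSnapshot(frozenset(table))


def visible_in_prestate(tid, snapshot):
    """Return True if the tuple was already visible under the snapshot."""
    return tid in snapshot.visible_tids


def scan_prestate(table, snapshot):
    """Scan the post-update table, dropping tuples inserted after the snapshot."""
    return [row for tid, row in table.items()
            if visible_in_prestate(tid, snapshot)]
```

Because visibility is decided by the snapshot and not by comparing transaction or command identifiers, the result does not depend on those identifiers increasing monotonically.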
Previously, a query string returned from pg_ivm_get_querydef
did not include the column names specified when the IMMV was defined
by create_immv. This caused failures in maintenance of MIN/MAX
aggregate views whose columns had alias names.
This is fixed by rewriting the result column names in the parse tree
using the view's tuple descriptor prior to calling pg_get_querydef
for PG15 or higher, or by specifying the tuple descriptor to
get_query_def for PG14 or earlier.
In order to re-calculate min/max values for groups where the min
or max value is deleted, we need the view query definition in string
form. However, pg_get_viewdef cannot be used for this purpose because
an IMMV's definition is stored in pg_ivm_immv, not in pg_rewrite. Therefore,
we have to convert the query definition in pg_ivm_immv to a query
definition string. We can use pg_get_querydef in PG15, but not
in PG14 or earlier, so there we use code copied from ruleutils.c in PG13
or PG14, depending on the version.
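Why a deleted minimum forces a re-calculation can be shown with a small sketch. It is a toy model under stated assumptions, not pg_ivm's code: `maintain_min` and its arguments are hypothetical, and scanning `base_rows` stands in for re-running the view's query against the base table for the affected groups.

```python
def maintain_min(view, base_rows, deleted):
    """Update {group: min} after rows were deleted from the base table.

    view      -- current per-group minimum held in the view
    base_rows -- (group, value) pairs remaining after the deletion
    deleted   -- (group, value) pairs that were deleted
    """
    # Only groups whose current minimum was deleted need re-calculation;
    # deleting any other value cannot change the minimum.
    needs_recalc = {g for g, v in deleted if g in view and v == view[g]}
    for group in needs_recalc:
        remaining = [v for g, v in base_rows if g == group]
        if remaining:
            view[group] = min(remaining)
        else:
            del view[group]  # the group no longer has any rows
    return view
```

The re-calculation step cannot be derived from the deleted rows alone, which is why the view definition has to be available in a form that can be executed again.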
- Allow use of a qualified name
- Confirm that it is executed by the owner of the IMMV
- Improve the message when the specified relation is not an IMMV
- Create a unique index at refresh with no data if possible
This is actually required, but we want it to behave the same
as the pgsql-ivm version for now.
In PostgreSQL 14 or later, OIDs of aggregate functions are
defined in fmgroids.h, but the one in PostgreSQL 13 doesn't
contain aggregate function OIDs. Therefore, we get the OID
by passing the function name and argument type to to_regprocedure().
Built-in count, sum, and avg are supported. min/max are not supported
yet, but will be in the future.
When an IMMV with any aggregates is defined, additional columns
are created for each aggregate function. Although such columns are
"hidden" in the pgsql-ivm version, they are always visible to users
in the extension version.
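One way to see why extra columns are useful is avg, which cannot be updated from its previous value alone but can be derived from a running count and sum. The sketch below is a toy model with hypothetical names (`AvgState`, `apply_delta`); it does not reflect the actual column names or layout used by pg_ivm.

```python
from dataclasses import dataclass


@dataclass
class AvgState:
    """Per-group state kept next to the visible avg value (hypothetical)."""
    count: int = 0      # extra column: number of input values
    total: float = 0.0  # extra column: running sum of input values

    @property
    def avg(self):
        # The visible aggregate is derived from the two extra columns.
        return self.total / self.count if self.count else None


def apply_delta(state, inserted=(), deleted=()):
    """Fold inserted and deleted input values into the running state."""
    for value in inserted:
        state.count += 1
        state.total += value
    for value in deleted:
        state.count -= 1
        state.total -= value
    return state
```

With this state, inserting or deleting rows only touches the affected group's count and sum, and the base table does not need to be rescanned.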
refresh_immv(immv_name, with_data) is a function to refresh an IMMV like
the REFRESH MATERIALIZED VIEW command. It has two arguments.
immv_name is the incrementally maintainable materialized view's name, and
with_data is an option corresponding to the WITH [NO] DATA option.
When with_data is set to false, the IMMV becomes unpopulated.
One difference between IMMVs unpopulated by this function and
normal materialized views unpopulated by REFRESH ... WITH NO DATA
is that such IMMVs can be referenced by SELECT but return no rows,
while unpopulated materialized views are not scannable.
The behaviour may be changed in the future to raise an error when an
unpopulated IMMV is scanned.