Previously, the view contents could become inconsistent with the base tables
in the following scenarios:
1) A concurrent transaction modifies a base table and commits before the
incremental view maintenance starts in the current transaction.
2) A concurrent transaction modifies a base table and commits before the
create_immv or refresh_immv command generates data.
3) Concurrent transactions incrementally update a view with a self-join
or modify multiple base tables simultaneously.
Incremental updates of a view are generally performed sequentially using an
exclusive lock. However, even if we are able to acquire the lock, a concurrent
transaction may have already incrementally updated the view and been committed
before we can acquire it. In REPEATABLE READ or SERIALIZABLE isolation levels,
this could lead to an inconsistent view state, which is the cause of the first
issue.
To fix this, a new field, lastivmupdate, has been added to the pg_ivm_immv
catalog to record the transaction ID of the most recent update to the view.
Before performing view maintenance, the transaction ID is checked. If the
transaction was still in progress at the start of the current transaction,
an error is raised to prevent anomalies.
To fix the second issue, the timing of CreateTrigger() has been moved to
before data generation. This ensures that locks conflicting with table
modifications have been acquired on all base tables. In addition, the latest
snapshot is used in READ COMMITTED level during the data generation to reflect
committed changes from concurrent transactions. Additionally, inconsistencies
that cannot be avoided through locking are prevented by checking the transaction
ID of the last view update, as done for the first issue.
However, concurrent table modifications and create_immv execution still cannot
be detected at the time of view creation. Therefore, create_immv raises a warning
in REPEATABLE READ or SERIALIZABLE isolation levels, suggesting that the command
be used in READ COMMITTED mode or that refresh_immv be executed afterward to
ensure the view remains consistent.
The third issue was caused by the snapshot used for checking tuple visibility in
the table's pre-update state not being the latest one. To fix this, the latest
snapshot is now used in READ COMMITTED mode.
Isolation tests are also added.
Issue #104
A RTE of modified table in the view definition query is substituted
by a subquery representing a delta table or a pre-update state table
during view maintenance. After this rewrite, Var that used to reference
the table column should become to references the corresponding column
in the subquery targetlist. Previously, the targetlist contained only
existing columns of the table. This was not a problem as long as the
table didn't have any dropped column because varattnos in the query
tree was identical to resno of the targetlist. However, if the table
has a dropped column, we cannot assume this correspondence, so an error
like the following occurred in that situation.
ERROR: could not find attribute 43 in subquery targetlist
To fix it, put "null" as a dummy value at the position in the targetlist
of a dropped column so that varattnos in the query tree is identical to
resno of the targetlist. We would also able to fix this by walking the
query tree to rewrite varattnos, but crafting targetlist is more simple
and reasonable.
Issue #85
Previously, pg_upgrade failed due to the permission denied
because the pg_ivm_immv catalog was in the pg_catalog catalog
(Issue #79). To fix this, all objects created by pg_ivm are
moved to theschema pgivm, which is also created by pg_ivm.
pg_ivm is still not relocatable and this must be installed
to the pgivm schema because the catalog and some internal
functions are referred to unqualified by the schema name
from the pg_ivm module. In future, this might be able to
relocatable during installation, though.
This commit affects compatibility with previous releases.
To allow to access objects like create_immv function as
previous, you need to qualify them with the schema name
or setup search_path properly.
Previously, a unique index is automatically created even when
a set-returning function is contained in FROM clause, but this
results in an error due to key duplication. Now, an index is
not created automatically when a relation other than table,
sub-query, or join is contained in FROM clause.
Issue (#99)
When pg_ivm is dropped, the error "could not open relation with OID ..." occurred
at the hook function that enables DROP TABLE on an IMMV to remove the entry in
pg_ivm_immv. It was because that the primary key was already dropped at the time
pg_ivm_immv's toast is been dropped. Also, DROP TABLE command issued concurrently
with DROP EXTENSION pg_Ivm also could cause the same error because pg_ivm_immv
could be already dropped.
This race condition is fixed by using always RangeVarGetRelidExtended to get OID of
pg_ivm_immv instead of using a cache of get_relname_relid results. This makes sure
that pg_ivm_immv exists when this is scanned.
The .gitignore file is copied from pg_bigm.
This doesn't contain files generated on Windows development environments for now,
but they will be added when anyone using Windows claim it.
Compilation errors and warning are fixed.
The design of create_immv is also chaned as similar to PG17, that is,
firstly a relation is created without data then it is populated
by using the refresh logic.
This commit contains the following changes:
- Change functions to use a safe search_path during maintenance operations
when used with PostgreSQL 17
This prevents maintenance operations (automatic maintenance of IMMVs and
refresh_immv) from performing unsafe access. Functions used by IMMVs that
need to reference non-default schemas must specify a search path during
function creation.
- refresh_immv can be executed by users with the MAINTAIN privilege
when used with PostgreSQL 17
Issue #90
This commit allows to use pg_ivm with the latest master branch code of
PostgreSQL. We may need additional fixes because PG17 is not released
yet, though.
EXISTS subquery is currently allowed only directly under WHERE clause
or in AND expression that is directly under WHERE. However, the check
was insufficient previously so that views using expressions other than
AND containing an EXISTS subquery could be created without an error
and it caused incorrect maintenance results.
To fix this check, add a new boolean member allow_context into
check_ivm_restriction_context. This member means whether EXISTS
subquery is allowed in the current node being examined in
check_ivm_restriction_walker. This should be set to true just before
calling check_ivm_restriction_walker for nodes directly under WHERE
or operands of AND expression direct under WHERE, and is reset to
false on every call of the function.
In passing, move the check for OR and NOT expression from
rewrite_exists_subquery_walker to check_ivm_restriction_context,
with some code cleaning.
---------
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Previously, DROP EXTENSION pg_ivm failed due to the failure of
opening the index on pg_ivm_immv in PgIvmObjectAccessHook that is
called on dropping pg_ivm_immv, because when pg_ivm_immv is being
dropped, the index on it is already dropped.
This is fixed to return immediately from the hook function if the
dropped table is pg_ivm_immv.
When pg_ivm is installed shared_preload_libraries without executing
CREATE EXTENSION command, the hook function is set while the catalog
table pg_ivm_immv is not created. In this case, the hook function
failed to open the catalog table and an error was raised.
Although this way of installing pg_ivm was not considered, it would
be nice to reduce any possible troubles on users. Therefore, to
prevent this error, check if PgIvmImmvRelationId is invalid before
open it. When this is invalid, pg_ivm_immv relation is not created
yet, so there are not any IMMVs, so we don't have to do anything
in this hook function.
Review and commit message by Yugo Nagata
In previous commit, maintenance of views using EXISTS and containing
duplicated tuples was fixed, but it was insufficient because it
raised an error during maintenance of views using both DISTINCT
and EXISTS. The cause was that tuples were tried to be duplicated
even when DISTINCT was specified. In this commit, it is fixed not
to duplicate tuples when DISTINCT is used.
When a tuple is inserted into a table in an EXISTS subquery,
the duplicity of row is computed by count(*), but it was not
considered and only one tuple was inserted with ignoring
the duplicity.
This is fixed by duplicating rows as much as the duplicity
by using generate_series at inserting.
(Issue #82)
Build errors/warnings against PostgreSQL 16 are fixed.
Also, adapted to the change of codes, including:
- Get rid of the "new" and "old" entries in a view's rangetable.
(Although, removed codes were dead codes because pg_ivm doesn't
have any rules in pg_rewrite.)
- Rework query relation permission checking
- Require empty Bitmapsets to be represented as NULL
- Fix some compiler warnings
Currently, types that does not have an equality operator cannot
be used in the target list of the view definition, because all
of target list entries are used for comparison to identify rows
to be updated or deleted in the view. Previously, an error is
raised at the time such view is incrementally maintained, this
is fixed to check that at the view creation time.
This restriction may be relaxed in future after we can use an
index to identify rows in the view.
Issue #61
Add EXISTS clause support in IVM
Correlated subqueries using EXISTS in WHERE clause are supported.
An EXISTS subquery in WHERE clause is rewritten to LATERAL subquery
in FROM clause, and IVM' process can handle this like as a normal join.
Also, hidden columns "ivm_exists_count_X__" are added to check if
EXISTS condition is satisfied. This column stores the count of how many
rows in the subquery are correlated to (joined to) each row of the main
query. When a base table contained in EXISTS clause is modified, this
count value in IMMV is updated, and a row whose count becomes zero
is deleted.
restrictions :
- EXISTS subqueries are allowed only in WHERE clause.
- aggregate functions are not supported together with EXISTS.
- EXISTS subqueries in a subquery are not supported.
- EXISTS condition can use only with AND Expr
Cached plans for recalculating min/max values are built using
pg_ivm_get_viewdef() that returns the view definition query text.
Therefore, if the search_path is changed, the query text is analyzed
again by SPI, and tables or functions in a wrong schema could be
referenced in the plan.
To fix this, we check whether the search_path is still the same
as when we made the cached plan and, if it isn't, we rebuild the
query text.
CVE-2023-23554
Previously, functions names in pg_catalog schema that were used during
view maintenance were not qualified. This is problematic because
functions in other schema could be referenced unintentionally. Moreover,
that could result in privilege escalation that if a nefarious user who
can create a function, arbitrary functions could be executed under IMMV
owner's privilege.
CVE-2023-23554
The view maintenance is performed under the view owner privilege.
If a modified table has a RLS policy, the policy must be applied
to relation for the pre-update-state table and the delta table
that contained inserted or deleted tuples. Previously, the security
quals were set to each ENR in a subquery that represents such
relation. However, the security check on the delta table was not
properly handled, and this caused that rows that must not be
accessed from the view owner could appear in the view contents
when the view was refreshed incrementally during a query containing
multiple types of commands, like a modifying CTE that contains
INSERT and UPDATE, or a MERGE command.
This patch fixes it by setting RLS policy to a subquery that
presents the pre-update-state table and the delta able instead of
to each RLS. Also, this change makes the code more simple and easy
to maintain.
CVE-2023-22847
Simple CTEs which does not contain aggregates or DISTINCT are
now supported similarly to simple sub-queries.
Before a view is maintained, all CTEs are converted to corresponding
subqueries to enable to treat CTEs as same as subqueries. For this
end, codes of the static function inline_cte in the core
(optimizer/plan/subselect.c) was imported.
Prohibit Unreferenced CTE is prohibited.
When a table in a unreferenced CTE is TRUNCATEd, the contents
of the IMMV is not affected so it must not be truncated. For
confirming it at the maintenance time, we have to check if the
modified table used in a CTE is actually referenced. Although
it would possible, we just disallow to create such IMMVs for now
since such unreferenced CTE is useless unless it doesn't contain
modifying commands, that is already prohibited.
It was added by 3de95c09 for consistency with the patch version
proposed for PostgreSQL core, but the called function was not
one we intended. Fortunately, the behavior has not been affected.
Now, I decided to remove this funciton since I come to think it is
confusable.
Also, added some comments on an argument left for the similar
consistency.
Some conditions on view definition that should be prohibited,
for example,
- SELECT ... FROM func(..., (SELECT ... FROM ...), ..) ...;
- SELECT expr(SELECT ... FROM ...) FROM ...;
were not checked. Also, some error messages and the order of
tests are improved.
Previously, it caused an error due to an ambiguous reference
at the maintenance time because generate_series is used internally.
This is fixed by using an alias name for the internal genearet_series.
When using DELETE or UPDATE, we must use exclusive lock for now
because apply_old_delta(_with_count) could result in wrong results
with concurrent transactions. We would like to improve it in future,
but we prevent it by using the lock for now.
The names of additional columns for aggregate views are derived from column
names specified in the first parameter of create_immv, but when the number
was less than the length of the actual target list, segmentation fault occurred.
Furthermore, when the number of specified columns is more than the target list,
it overrode additional column names and it caused a failure of incremental
maintenance of an aggregate view.
To fix then, check the length of the specified column name list not to access
invalid area, and also prevent from overriding additional column names.