pg_ivm

Author	SHA1	Message	Date
Yugo Nagata	f1166c0421	Fix potential view inconsistency issues (#121 ) Previously, the view contents could become inconsistent with the base tables in the following scenarios: 1) A concurrent transaction modifies a base table and commits before the incremental view maintenance starts in the current transaction. 2) A concurrent transaction modifies a base table and commits before the create_immv or refresh_immv command generates data. 3) Concurrent transactions incrementally update a view with a self-join or modify multiple base tables simultaneously. Incremental updates of a view are generally performed sequentially using an exclusive lock. However, even if we are able to acquire the lock, a concurrent transaction may have already incrementally updated the view and been committed before we can acquire it. In REPEATABLE READ or SERIALIZABLE isolation levels, this could lead to an inconsistent view state, which is the cause of the first issue. To fix this, a new field, lastivmupdate, has been added to the pg_ivm_immv catalog to record the transaction ID of the most recent update to the view. Before performing view maintenance, the transaction ID is checked. If the transaction was still in progress at the start of the current transaction, an error is raised to prevent anomalies. To fix the second issue, the timing of CreateTrigger() has been moved to before data generation. This ensures that locks conflicting with table modifications have been acquired on all base tables. In addition, the latest snapshot is used in READ COMMITTED level during the data generation to reflect committed changes from concurrent transactions. Additionally, inconsistencies that cannot be avoided through locking are prevented by checking the transaction ID of the last view update, as done for the first issue. However, concurrent table modifications and create_immv execution still cannot be detected at the time of view creation. 
Therefore, create_immv raises a warning in REPEATABLE READ or SERIALIZABLE isolation levels, suggesting that the command be used in READ COMMITTED mode or that refresh_immv be executed afterward to ensure the view remains consistent. The third issue was caused by the snapshot used for checking tuple visibility in the table's pre-update state not being the latest one. To fix this, the latest snapshot is now used in READ COMMITTED mode. Isolation tests are also added. Issue #104	2025-03-10 18:26:54 +09:00
Yugo Nagata	5b8b2f0a82	Fix targetlist of subquery substituting modified table (#117 ) An RTE of a modified table in the view definition query is substituted by a subquery representing a delta table or a pre-update state table during view maintenance. After this rewrite, a Var that used to reference a table column should reference the corresponding column in the subquery targetlist. Previously, the targetlist contained only existing columns of the table. This was not a problem as long as the table didn't have any dropped columns, because varattnos in the query tree were identical to resnos in the targetlist. However, if the table has a dropped column, we cannot assume this correspondence, so an error like the following occurred in that situation. ERROR: could not find attribute 43 in subquery targetlist To fix it, put "null" as a dummy value at the position of each dropped column in the targetlist so that varattnos in the query tree are identical to resnos in the targetlist. We could also fix this by walking the query tree to rewrite varattnos, but crafting the targetlist is simpler and more reasonable. Issue #85	2025-02-20 12:58:03 +09:00
Yugo Nagata	417c291454	Change the schema from pg_catalog to pgivm (#116 ) Previously, pg_upgrade failed with a permission-denied error because the pg_ivm_immv catalog was in the pg_catalog schema (Issue #79). To fix this, all objects created by pg_ivm are moved to the schema pgivm, which is also created by pg_ivm. pg_ivm is still not relocatable and must be installed into the pgivm schema because the catalog and some internal functions are referred to unqualified by the schema name from the pg_ivm module. In the future, it might become relocatable during installation, though. This commit affects compatibility with previous releases. To access objects such as the create_immv function as before, you need to qualify them with the schema name or set up search_path properly.	2025-02-17 12:07:21 +09:00
Yongtao Huang	094add99f5	Clean duplicated code and some typos (#112 ) Also, argument order mismatch of apply_new_delta_with_count() between declaration and definition is fixed.	2025-02-12 15:30:46 +09:00
Yugo Nagata	65a2d36b22	Add support for PostgreSQL 17 (#92 ) Compilation errors and warnings are fixed. The design of create_immv is also changed to be similar to PG17; that is, a relation is first created without data and then populated using the refresh logic. This commit contains the following changes: - Change functions to use a safe search_path during maintenance operations when used with PostgreSQL 17 This prevents maintenance operations (automatic maintenance of IMMVs and refresh_immv) from performing unsafe access. Functions used by IMMVs that need to reference non-default schemas must specify a search path during function creation. - refresh_immv can be executed by users with the MAINTAIN privilege when used with PostgreSQL 17 Issue #90	2024-07-31 12:37:43 +09:00
reshke	76888cd975	Fix compile with PG17 (#76 ) Support for PostgreSQL17 Rename OverrideSearchPath to SearchPathMatcher from PG17.	2024-03-25 09:25:27 +09:00
thoshiai	fc2339e16c	Fix checking for expressions containing an EXISTS subquery (#71 ) An EXISTS subquery is currently allowed only directly under the WHERE clause or in an AND expression that is directly under WHERE. However, the check was previously insufficient, so views using expressions other than AND containing an EXISTS subquery could be created without an error, which caused incorrect maintenance results. To fix this check, add a new boolean member allow_context into check_ivm_restriction_context. This member indicates whether an EXISTS subquery is allowed in the current node being examined in check_ivm_restriction_walker. It should be set to true just before calling check_ivm_restriction_walker for nodes directly under WHERE or operands of an AND expression directly under WHERE, and is reset to false on every call of the function. In passing, move the check for OR and NOT expressions from rewrite_exists_subquery_walker to check_ivm_restriction_context, with some code cleanup. --------- Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>	2024-03-01 20:33:56 +09:00
Yugo Nagata	01f0ea0eb1	Fix for view using both DISTINCT and EXISTS In the previous commit, maintenance of views using EXISTS and containing duplicated tuples was fixed, but the fix was insufficient because it raised an error during maintenance of views using both DISTINCT and EXISTS. The cause was that an attempt was made to duplicate tuples even when DISTINCT was specified. In this commit, tuples are no longer duplicated when DISTINCT is used.	2024-02-29 17:55:55 +09:00
Yugo Nagata	8c3b4ba9d7	Consider tuple duplicity in maintenance of EXISTS views When a tuple is inserted into a table in an EXISTS subquery, the duplicity of the row is computed by count(*), but it was not taken into account and only one tuple was inserted regardless of the duplicity. This is fixed by duplicating rows according to their duplicity using generate_series at insertion. (Issue #82)	2024-02-29 17:55:55 +09:00
Yugo Nagata	71f9d268b0	Add support for PostgreSQL 16 (#69 ) (#70 ) Build errors/warnings against PostgreSQL 16 are fixed. Also adapted to changes in the core code, including: - Get rid of the "new" and "old" entries in a view's rangetable. (The removed code was dead code anyway, because pg_ivm doesn't have any rules in pg_rewrite.) - Rework query relation permission checking - Require empty Bitmapsets to be represented as NULL - Fix some compiler warnings	2023-09-11 15:23:51 +09:00
thoshiai	c355f4003b	Support exists_subquery (#53 ) Add EXISTS clause support in IVM Correlated subqueries using EXISTS in the WHERE clause are supported. An EXISTS subquery in the WHERE clause is rewritten to a LATERAL subquery in the FROM clause, so the IVM process can handle it like a normal join. Also, hidden columns "ivm_exists_count_X__" are added to check whether the EXISTS condition is satisfied. Each such column stores the count of how many rows in the subquery are correlated to (joined to) each row of the main query. When a base table contained in an EXISTS clause is modified, this count value in the IMMV is updated, and a row whose count becomes zero is deleted. Restrictions: - EXISTS subqueries are allowed only in the WHERE clause. - Aggregate functions are not supported together with EXISTS. - EXISTS subqueries in a subquery are not supported. - An EXISTS condition can be combined only with AND expressions.	2023-08-31 11:51:36 +09:00
Yugo Nagata	aaaa6cff66	Rebuild the query for recalculating min/max after search_path change Cached plans for recalculating min/max values are built using pg_ivm_get_viewdef() that returns the view definition query text. Therefore, if the search_path is changed, the query text is analyzed again by SPI, and tables or functions in a wrong schema could be referenced in the plan. To fix this, we check whether the search_path is still the same as when we made the cached plan and, if it isn't, we rebuild the query text. CVE-2023-23554	2023-03-02 14:44:05 +09:00
Yugo Nagata	14bb84c599	Fix to use qualified function names during maintenance Previously, names of functions in the pg_catalog schema that were used during view maintenance were not qualified. This is problematic because functions in other schemas could be referenced unintentionally. Moreover, it could result in privilege escalation: if a nefarious user can create a function, arbitrary functions could be executed under the IMMV owner's privilege. CVE-2023-23554	2023-03-02 14:43:24 +09:00
Yugo Nagata	99a176ab97	Fix row level security checks during view maintenance The view maintenance is performed under the view owner privilege. If a modified table has an RLS policy, the policy must be applied to the relations for the pre-update-state table and the delta table that contains inserted or deleted tuples. Previously, the security quals were set on each ENR in a subquery that represents such a relation. However, the security check on the delta table was not properly handled, and as a result rows that must not be accessible to the view owner could appear in the view contents when the view was refreshed incrementally during a query containing multiple types of commands, like a modifying CTE that contains INSERT and UPDATE, or a MERGE command. This patch fixes it by setting the RLS policy on a subquery that represents the pre-update-state table and the delta table instead of on each ENR. Also, this change makes the code simpler and easier to maintain. CVE-2023-22847	2023-03-02 14:42:37 +09:00
Yugo Nagata	b928e32774	Add CTE support (#47 ) Simple CTEs that do not contain aggregates or DISTINCT are now supported similarly to simple subqueries. Before a view is maintained, all CTEs are converted to corresponding subqueries so that CTEs can be treated the same as subqueries. To this end, the code of the static function inline_cte in the core (optimizer/plan/subselect.c) was imported. Unreferenced CTEs are prohibited. When a table in an unreferenced CTE is TRUNCATEd, the contents of the IMMV are not affected, so it must not be truncated. To confirm this at maintenance time, we would have to check whether the modified table used in a CTE is actually referenced. Although this would be possible, we just disallow creating such IMMVs for now, since an unreferenced CTE is useless unless it contains modifying commands, which are already prohibited.	2023-01-30 11:28:27 +09:00
Yugo Nagata	7997f3e260	Remove an unnecessary function call It was added by `3de95c09` for consistency with the patch version proposed for PostgreSQL core, but the called function was not the one we intended. Fortunately, the behavior has not been affected. Now, I decided to remove this call since I have come to think it is confusing. Also, added some comments on an argument left for similar consistency.	2023-01-27 15:37:11 +09:00
Yugo Nagata	e00ff9dd32	Fix to allow generate_series in FROM clause (#50 ) Previously, it caused an error due to an ambiguous reference at maintenance time because generate_series is used internally. This is fixed by using an alias name for the internal generate_series.	2023-01-26 16:58:20 +09:00
Yugo Nagata	b6702f9a3a	Use snapshot to check tuple visibility in pre-update state (#28 ) When multiple tables are updated or the view contains a self-join, we need to calculate the state of each table before it was modified during incremental view maintenance. To get the pre-update state, tuples inserted by a command must be removed when the table is scanned. Previously, we used the xmin and cmin system columns for this purpose, but this approach is problematic because after a tuple is frozen, its xmin no longer has any meaning. In fact, we would get inconsistent view results after XID wraparound. Also, we can see similar inconsistency when using sub-transactions because xmin values do not always increase monotonically across command executions. To fix this, we use a snapshot taken just before the table is modified to check tuple visibility in the pre-state table instead of using the xmin and cmin system columns. A new function returning boolean, ivm_visible_in_prestate, is added, and it is called in the WHERE clause of subqueries that calculate the pre-state table. This function checks whether a specified tuple in the post-update table is visible using the snapshot and returns true if the tuple is visible.	2022-09-30 11:53:12 +09:00
Yugo Nagata	4c6016999d	Fix to get view definition string with correct column names (#26 ) Previously, a query string returned from pg_ivm_get_querydef did not include column names specified when IMMV was defined by create_immv. This caused failures in maintenance of MIN/MAX aggregate views whose columns had alias names. It is fixed by rewriting the result column name in the parse tree using the view's tuple descriptor prior to calling pg_get_querydef for PG15 or higher, or specifying the tuple descriptor to get_query_def for PG14 or earlier.	2022-09-29 22:24:28 +09:00
Marco Slot	046b323ce5	Fix compiler warning in ExecRefreshImmv	2022-07-28 18:05:39 +02:00
Yugo Nagata	6faf0b3baa	Support min/max aggregates (#18 ) In order to re-calculate min/max values for groups where the min or max value is deleted, we need the view query definition in string form. However, pg_get_viewdef cannot be used for this purpose because the IMMV's definition is in pg_ivm_immv, not pg_rewrite. Therefore, we have to convert the query definition in pg_ivm_immv to a query definition string. We can use pg_get_querydef in PG15, but not in PG14 or earlier, so we use code from ruleutils.c copied from PG13 or PG14 depending on the version.	2022-07-25 13:11:33 +09:00
thoshiai	790b0d2bd6	Support simple subquery (#17 ) A simple subquery in FROM clause is supported. DISTINCT and aggregate functions are not supported in subquery.	2022-07-25 09:24:51 +09:00
thoshiai	2ead2e207e	Add support for PostgreSQL 15 (#15 )	2022-07-14 22:54:33 +09:00
Yugo Nagata	3de95c09fa	Improve refresh_immv behavior a bit - Allow use of a qualified name - Confirm if executed by the owner of the IMMV - Improve the message when the specified relation is not an IMMV - Create a unique index at refresh with no data if possible This is actually required, but we want it to behave the same as the pgsql-ivm version for now.	2022-06-23 11:33:06 +09:00
Yugo Nagata	d99aeb848e	Allow TRUNCATE on base tables (#144 ) When a base table is truncated, the view content will be empty if the view definition query does not contain an aggregate without a GROUP BY clause. Therefore, such views can be truncated. Aggregate views without a GROUP BY clause always have one row. Therefore, if a base table is truncated, the view will not be empty and will contain a row with a NULL value (or 0 for count()). So, in this case, we refresh the view instead of truncating it.	2022-06-21 21:27:21 +09:00
Yugo Nagata	57c8bac1a0	Add aggregates support Built-in count, sum, avg are supported. We do not support min/max yet, but will support them in the future. When an IMMV with any aggregates is defined, additional columns are created for each aggregate function. Although such columns are "hidden" in the pgsql-ivm version, they are always visible to users in the extension version.	2022-06-21 20:50:45 +09:00
thoshiai	51a944b388	Add refresh_immv() function refresh_immv(immv_name, with_data) is a function to refresh an IMMV like the REFRESH MATERIALIZED VIEW command. It has two arguments. immv_name is the name of the incrementally maintainable materialized view, and with_data is an option corresponding to the WITH [NO] DATA option. When with_data is set to false, the IMMV becomes unpopulated. One difference between IMMVs unpopulated by this function and normal materialized views unpopulated by REFRESH ... WITH NO DATA is that such IMMVs can be referenced by SELECT but return no rows, while unpopulated materialized views are not scannable. The behaviour may be changed in the future to raise an error when an unpopulated IMMV is scanned.	2022-06-16 03:06:47 +09:00
Tatsuo Ishii	612b59694e	Allow to build pg_ivm on PostgreSQL 13.	2022-05-17 21:32:04 +09:00
Yugo Nagata	eed6271128	Split files to make it easier to follow the core code	2022-04-27 14:45:47 +09:00
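
Illustration for commit 5b8b2f0a82 (dropped columns in the subquery targetlist). The following is a minimal Python sketch of the idea described in that commit message, not pg_ivm's C code. The names Column, build_targetlist and lookup are hypothetical and exist only for this example. It shows why a targetlist that skips dropped columns breaks positional lookup by varattno, and why padding dropped positions with a null placeholder restores the correspondence between varattno and resno.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a table's attribute list: (name, is_dropped).
Column = Tuple[str, bool]


def build_targetlist(columns: List[Column]) -> List[Optional[str]]:
    """Return one entry per attribute; a dropped attribute becomes None ("null")."""
    return [None if dropped else name for name, dropped in columns]


def lookup(targetlist: List[Optional[str]], varattno: int) -> Optional[str]:
    """Resolve a 1-based varattno against the targetlist by position (resno)."""
    if not 1 <= varattno <= len(targetlist):
        raise LookupError(
            f"could not find attribute {varattno} in subquery targetlist"
        )
    return targetlist[varattno - 1]


# A table whose second attribute has been dropped.
columns = [("id", False), ("obsolete", True), ("price", False)]

# Padded targetlist: positions line up with the original attribute numbers.
padded = build_targetlist(columns)

# Compact targetlist (the old behavior in this model): dropped column omitted.
compact = [name for name, dropped in columns if not dropped]
```

In this model, lookup(padded, 3) resolves to "price", while lookup(compact, 3) raises the "could not find attribute" error because the compact list has only two entries.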

29 commits
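
Illustration for commits c355f4003b and 8c3b4ba9d7 (EXISTS support). The sketch below is a simplified Python model of the hidden count column described in those commit messages, under the assumption that the view is of the form SELECT * FROM outer_t o WHERE EXISTS (SELECT 1 FROM inner_t i WHERE i.key = o.key). The class ExistsView and its methods are hypothetical names for this example only; the real extension keeps the count in a hidden column of the IMMV rather than in a separate map.

```python
from collections import Counter
from typing import Hashable, List, Tuple

Row = Tuple[Hashable, str]  # (join key, payload)


class ExistsView:
    """Toy model of an EXISTS view maintained with a per-key match count."""

    def __init__(self, outer_rows: List[Row]) -> None:
        self.outer_rows = list(outer_rows)
        # key -> number of rows in the subquery table that match this key
        self.exists_count: Counter = Counter()

    def insert_inner(self, key: Hashable) -> None:
        """A matching row was inserted into the subquery table."""
        self.exists_count[key] += 1

    def delete_inner(self, key: Hashable) -> None:
        """A matching row was deleted from the subquery table."""
        if self.exists_count[key] == 0:
            raise ValueError(f"no inner row with key {key!r} to delete")
        self.exists_count[key] -= 1
        if self.exists_count[key] == 0:
            del self.exists_count[key]

    def rows(self) -> List[Row]:
        """Outer rows stay visible only while their count is above zero."""
        return [row for row in self.outer_rows if self.exists_count[row[0]] > 0]
```

A short usage example: after two inner inserts for key 1, the outer row with key 1 is visible; it stays visible after one delete and disappears only when the count reaches zero.

```python
view = ExistsView([(1, "a"), (2, "b")])
view.insert_inner(1)
view.insert_inner(1)
first = view.rows()
view.delete_inner(1)
second = view.rows()
view.delete_inner(1)
third = view.rows()
```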